Monitoring
Monitoring covers three main categories: logs, metrics and traces. For all three we rely on open source projects, and everything we currently use comes from Grafana.
In all of our clusters we run Grafana Alloy, deployed through k8s-monitoring, Grafana's packaging of Alloy for Kubernetes environments.
It is highly recommended to aggregate all traces, logs and metrics centrally within a cluster before forwarding them to the desired destination. Credentials for the backends then only need to be updated in one place, which makes them much easier to manage.
Pod Logs
We capture all pod logs from the Kubernetes cluster for applications that we control and own. We drop all other pod logs rather than sending them to our own logging backend.
You can either run your own log scraping setup, or we can enable forwarding of pod logs to a location of your choice, provided your backend is Loki or accepts OTLP. If you would like to run your log backend inside the infrastructure, we suggest running Loki inside the management cluster. We can then forward the pod logs to that centralised Loki.
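To make the Loki forwarding target more concrete, here is a minimal sketch of the JSON body that Loki's HTTP push endpoint (`/loki/api/v1/push`) accepts. In the setup described above Alloy does the forwarding, so this is purely illustrative; the function name and the label names are hypothetical.

```python
import json
import time


def build_loki_push_payload(labels, lines, now_ns=None):
    """Build the JSON body for Loki's push endpoint from plain log lines."""
    if now_ns is None:
        now_ns = time.time_ns()
    # Loki expects each entry as [timestamp, line], with the timestamp given
    # as a string of nanoseconds since the Unix epoch.
    values = [[str(now_ns + i), line] for i, line in enumerate(lines)]
    # One stream is one unique label set plus its log entries.
    return {"streams": [{"stream": dict(labels), "values": values}]}


if __name__ == "__main__":
    payload = build_loki_push_payload(
        {"namespace": "customer-app", "pod": "web-0"},  # hypothetical labels
        ["starting server", "listening on :8080"],
    )
    print(json.dumps(payload, indent=2))
```

The labels in `stream` are what you would later query on in Loki, so it is worth agreeing on them before forwarding is enabled.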
Metrics
We capture metrics from multiple sources inside the cluster: cluster metrics, Pod Monitors, Service Monitors and Probes. Just as with pod logs, we drop all Service Monitor, Pod Monitor and Probe metrics rather than sending them to our own backend.
Customers can choose how they would like to run their own metrics stack. It is possible to own and run metrics scrapers from the customer namespaces. Alternatively, we can forward any scraped metrics to a Prometheus-compatible service. For customers who wish to host their own metrics setup inside the clusters, running a Prometheus-compatible deployment inside the management cluster is the preferred solution.
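For readers unfamiliar with Service Monitors: they are Prometheus Operator custom resources that describe which Services to scrape and how. The sketch below builds one as a plain dictionary. The resource name, namespace, `app` label and port name are hypothetical placeholders, and which namespaces and labels are actually picked up depends on how the cluster is configured.

```python
import json


def service_monitor(name, namespace, app_label, port="metrics", interval="30s"):
    """Return a Prometheus Operator ServiceMonitor manifest as a dict."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Select the Services whose endpoints should be scraped.
            "selector": {"matchLabels": {"app": app_label}},
            # "port" refers to a named port on the selected Service.
            "endpoints": [
                {"port": port, "path": "/metrics", "interval": interval}
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical application; adjust names to your own deployment.
    manifest = service_monitor("web", "customer-app", "web")
    print(json.dumps(manifest, indent=2))
```

Since `kubectl apply -f -` accepts JSON as well as YAML, the printed manifest can be piped straight into it for a quick test.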
Traces
Traces can be captured and forwarded to any service that supports OTLP. We do not collect traces for applications running in our clusters ourselves, as we have no need for them in the current environment. To collect traces, run an OpenTelemetry Collector sidecar container that forwards them to Alloy, which in turn forwards them to an OTLP-capable destination of the customer's choosing.
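As a rough illustration of what travels along that path, the following sketch builds a single-span trace in the JSON encoding used by OTLP over HTTP (conventionally posted to `/v1/traces`). In practice an OpenTelemetry SDK and the Collector produce this for you; the function name, service name and scope name here are hypothetical.

```python
import json
import secrets
import time


def build_otlp_trace_payload(service_name, span_name, start_ns, end_ns):
    """Build an OTLP/HTTP JSON body containing one span."""
    span = {
        # OTLP JSON encodes IDs as hex: 16 bytes for a trace, 8 for a span.
        "traceId": secrets.token_hex(16),
        "spanId": secrets.token_hex(8),
        "name": span_name,
        "kind": 1,  # SPAN_KIND_INTERNAL
        # 64-bit timestamps are written as strings in the JSON encoding.
        "startTimeUnixNano": str(start_ns),
        "endTimeUnixNano": str(end_ns),
    }
    resource = {
        "attributes": [
            {"key": "service.name", "value": {"stringValue": service_name}}
        ]
    }
    scope_spans = [{"scope": {"name": "example-instrumentation"}, "spans": [span]}]
    return {"resourceSpans": [{"resource": resource, "scopeSpans": scope_spans}]}


if __name__ == "__main__":
    start = time.time_ns()
    payload = build_otlp_trace_payload(
        "checkout", "charge-card", start, start + 5_000_000
    )
    print(json.dumps(payload, indent=2))
```

The `service.name` resource attribute is what most OTLP backends use to group spans by application, so it is the one attribute worth setting consistently.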
