Metrics collection
The platform uses Google Managed Prometheus (GMP), which provides a scalable backend for Prometheus metric storage, plus collectors that scrape exposed metrics endpoints such as kubelet/cadvisor and kube-state-metrics. Scrape targets are configured through CRDs, which are defined here: https://github.com/GoogleCloudPlatform/prometheus-engine/blob/v0.7.4/doc/api.md
The GMP operator runs as a deployment:
kubectl -n gmp-system get deploy gmp-operator
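As a rough illustration of how a scrape target is declared through these CRDs, the sketch below prints a minimal PodMonitoring manifest. The resource name example-app, the default namespace, the app: example-app label, the metrics port name and the 30s interval are all assumptions made for this example, not values taken from the platform.

```shell
# Print a minimal PodMonitoring manifest (hypothetical names throughout).
example_pod_monitoring() {
  cat <<'EOF'
apiVersion: monitoring.googleapis.com/v1
kind: PodMonitoring
metadata:
  name: example-app        # hypothetical resource name
  namespace: default       # hypothetical namespace
spec:
  selector:
    matchLabels:
      app: example-app     # assumed pod label
  endpoints:
    - port: metrics        # assumed name of the container port serving metrics
      interval: 30s
EOF
}

# Usage (requires cluster access):
#   example_pod_monitoring | kubectl apply -f -
```

Printing the manifest from a function keeps it easy to review or diff before anything is applied to the cluster.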
Kube state metrics - docs
Generates metrics from a wide range of Kubernetes objects. These can be used to assess the health of your pods, deployments, jobs and many other Kubernetes objects.
Metric names generally start with kube_.
It runs as a deployment:
kubectl -n gmp-public get deploy kube-state-metrics
Note that GMP relabels the namespace label to exported_namespace, because it reserves namespace for the namespace of the pod that the metric is scraped from. When importing dashboards that rely on kube-state-metrics metrics, the queries must use exported_namespace.
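When adapting an imported dashboard, the label rename can be applied mechanically to each query. The sketch below is one possible approach, not an official GMP tool: rewrite_namespace_label is a hypothetical helper that reads a PromQL expression on stdin and renames the namespace label in both matchers and grouping clauses. It is a plain text substitution, so the result should be reviewed, and it should only be applied to queries against kube-state-metrics metrics.

```shell
# Rename the "namespace" label to "exported_namespace" in a PromQL expression
# read from stdin. Names that merely contain "namespace" (exported_namespace,
# kube_namespace_labels) and dashboard variables ($namespace) are left alone.
rewrite_namespace_label() {
  sed -E 's/(^|[^_$[:alnum:]])namespace([^_[:alnum:]]|$)/\1exported_namespace\2/g'
}

# Usage:
#   echo 'sum by (namespace) (kube_pod_info{namespace="$namespace"})' \
#     | rewrite_namespace_label
```

The pattern only matches namespace when it stands alone as a token, which is why already converted queries pass through unchanged.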
cadvisor - docs
Collects metrics for containers running on the node; it runs alongside kubelet on each node. Typical metrics include CPU, memory and I/O usage, which can be used to diagnose performance issues.
Metric names generally start with container_.
kubelet - docs
kubelet is the agent running on each node that is responsible for ensuring containers are running and healthy. Collected metrics can be used to identify pod start duration, the number of pods and containers on the node, and other information about the node, such as its status.
Blackbox exporter - docs
This is used to probe key endpoints on or outside the platform, so we can monitor the uptime and SSL certificate expiry of components with TLS termination. It runs as a deployment:
kubectl -n platform-monitoring get deploy prometheus-blackbox-exporter
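For an ad hoc check, the exporter's /probe endpoint returns the probe result as plain-text metrics, including probe_success and probe_ssl_earliest_cert_expiry (a Unix timestamp). The sketch below turns that output into days remaining on the certificate. cert_days_left is a hypothetical helper, and the service address, port 9115 (the upstream default) and the http_2xx module name in the usage comment are assumptions about how the exporter is deployed here.

```shell
# Read blackbox exporter /probe output on stdin and print the whole number of
# days until the earliest certificate in the chain expires.
# $1: current time as a Unix timestamp (defaults to the current time).
cert_days_left() {
  now="${1:-$(date +%s)}"
  awk -v now="$now" '
    $1 == "probe_ssl_earliest_cert_expiry" { printf "%d\n", ($2 - now) / 86400 }
  '
}

# Usage (hypothetical service address, target and module name):
#   curl -s 'http://prometheus-blackbox-exporter.platform-monitoring:9115/probe?target=https://example.com&module=http_2xx' \
#     | cert_days_left
```

Passing the current time as an argument is optional; it mainly makes the calculation reproducible when checking saved probe output.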
Node exporter - docs
Collects hardware and OS-level system metrics exposed on the node. Metrics include host memory, CPU, disk and network usage. It runs as a daemonset:
kubectl -n gmp-public get ds node-exporter