The usage-metrics-collector is a Prometheus metrics collector optimized for collecting Kubernetes usage and capacity metrics.
Why not just use PromQL and recording rules?
- Scale
  - Aggregate at collection time to reduce Prometheus work
  - Export aggregated metrics without the raw metrics to reduce Prometheus storage
- Insight
  - Join labels at collection time (e.g. set the priority class on pod metrics)
  - Set hard-to-resolve labels (e.g. set the workload kind on pod metrics)
  - View node-level cgroup utilization (e.g. kubepods vs system.slice metrics)
- Fidelity
  - Scrape utilization at 1s intervals as raw metrics
  - Perform aggregations on the 1s interval metrics (e.g. get the p95 1s utilization sample for all replicas of a workload)
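To make the "p95 1s utilization sample" idea concrete, here is a minimal sketch of a nearest-rank 95th percentile over a list of samples. It only illustrates the statistic: the collector's own percentile method is not described here, and the function name `p95` is made up for this example.

```shell
# p95 reads one numeric sample per line on stdin and prints the
# nearest-rank 95th percentile (illustrative helper, not part of the collector).
p95() {
  sort -n | awk '
    { v[NR] = $1 }                          # samples arrive sorted ascending
    END {
      if (NR == 0) exit 1                   # no samples, nothing to print
      rank = int((NR * 95 + 99) / 100)      # ceil(0.95 * NR)
      print v[rank]
    }'
}
```

Given twenty 1s samples with values 1 through 20 (for example `seq 1 20 | p95`), the nearest-rank p95 is the 19th sorted value. Storing only that single value per workload, instead of every raw sample, is the kind of reduction the Scale bullets describe.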
A sample of the exposed metrics is available in METRICS.md.
In addition to these metrics, a series of performance-related metrics is published for the collection process. These metrics are documented in the performance analysis document.
Note: No usage-metrics-collector container image is publicly hosted. Folks will need to build and publish their own until this is resolved.
Note: Only cgroups v1 is currently supported.
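Because only cgroups v1 is supported, it can help to check which version a node uses before installing. The sketch below relies on the filesystem type of /sys/fs/cgroup, which is cgroup2fs on cgroups v2 and tmpfs on cgroups v1. The function names are made up for this example, and it has to run on the node itself (or in a shell on the node), not on your workstation.

```shell
# Map the filesystem type of /sys/fs/cgroup to a cgroup version.
# cgroup2fs means the unified (v2) hierarchy; tmpfs means v1.
cgroup_version_from_fstype() {
  case "$1" in
    cgroup2fs) echo "v2" ;;
    tmpfs)     echo "v1" ;;
    *)         echo "unknown" ;;
  esac
}

# Report the cgroup version of the machine this runs on.
node_cgroup_version() {
  cgroup_version_from_fstype "$(stat -fc %T /sys/fs/cgroup/ 2>/dev/null)"
}
```

Calling `node_cgroup_version` prints v1, v2, or unknown. If it prints v2, utilization metrics should not be expected to work with the current collector.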
- Create a kind cluster
kind create cluster
- Build the image
docker build . -t usage-metrics-collector:v0.0.0
- Load the image into kind
kind load docker-image usage-metrics-collector:v0.0.0
- Make sure the "Kind cluster values" config portion is uncommented in config/metrics-prometheus-collector/configmaps/sampler.yaml
- Install the config
kustomize build config | kubectl apply -f -
- Update your context to use the usage-metrics-collector namespace by default
kubectl config set-context --current --namespace=usage-metrics-collector
Note: Only cgroups v1 is supported for utilization right now. GKE clusters 1.26+ use cgroups v2 by default.
- Build the image
docker build . -t my-org/usage-metrics-collector:v0.0.0
- Push the image to a container repo
docker push my-org/usage-metrics-collector:v0.0.0
- Make sure the "GKE cluster values" config portion is uncommented in config/metrics-prometheus-collector/configmaps/sampler.yaml
  - Other "cluster values" portions should be commented out
- Install the config
kustomize build config | kubectl apply -f -
- Update your context to use the usage-metrics-collector namespace by default
kubectl config set-context --current --namespace=usage-metrics-collector
- Make sure the pods are healthy
kubectl get pods
- Make sure the services have endpoints
kubectl describe services
- Get the metrics from the collector itself
kubectl exec -t -i $(kubectl get pods -o name -l app=metrics-prometheus-collector) -- curl localhost:8080/metrics
- wait for the service to be ready
kubectl port-forward service/metrics-prometheus-collector 8080:8080
- visit localhost:8080/metrics in your browser
- Get the metrics from prometheus
kubectl port-forward $(kubectl get pods -o name -l app=prometheus) 9090:9090
- visit localhost:9090/ in your browser
- View the metrics in Grafana
kubectl port-forward service/grafana 3000:3000
- visit localhost:3000 in your browser
- enter "admin" for the username and password
- go to "Explore"
- change the source to "prometheus"
- enter kube_usage_ into the metric field
- remove the label filters
- click "Run Query"
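As a command-line alternative to browsing the collector's /metrics page, the following sketch lists the unique metric names that start with the kube_usage_ prefix used in the Grafana query above. It assumes the input is in the Prometheus text exposition format. The function name `usage_metric_names` is made up for this example.

```shell
# usage_metric_names reads Prometheus text exposition format on stdin and
# prints the unique names of metrics that start with the given prefix
# (default: kube_usage_). Illustrative helper, not part of the collector.
usage_metric_names() {
  prefix="${1:-kube_usage_}"
  grep -v '^#' |          # drop HELP and TYPE comment lines
    sed 's/[{ ].*//' |    # keep only the metric name (cut at labels or value)
    grep "^${prefix}" |   # keep names with the requested prefix
    sort -u
}

# Example, while the collector port-forward from the steps above is running:
#   curl -s localhost:8080/metrics | usage_metric_names
```

An empty result while the pods are healthy suggests that the collector is up but not yet exporting usage metrics, which is worth checking before debugging Prometheus or Grafana.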
- Edit config/metrics-prometheus-collector/configmaps/collector.yaml
- Run make run-local
- View the updated metrics in Grafana
TODO: Write more on this
TODO: Write this
TODO: Write this
Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.