Giter Club home page Giter Club logo

usage-metrics-collector's Introduction

Usage Metrics Collector

The usage-metrics-collector is a Prometheus metrics collector optimized for collecting kube usage and capacity metrics.

Motivation

Why not just use promql and recording rules?

  • Scale
    • Aggregate at collection time to reduce prometheus work
    • Export aggregated metrics without the raw metrics to reduce prometheus storage
  • Insight
    • Join labels at collection time (e.g. set the priority class on pod metrics)
    • Set hard to resolve labels (e.g. set the workload kind on pod metrics)
    • View node-level cgroup utilization (e.g. kubepods vs system.slice metrics)
  • Fidelity
    • Scrape utilization at 1s intervals as raw metrics
    • Perform aggregations on the 1s interval metrics (e.g. get the p95 1s utilization sample for all replicas of a workload)

Exposed Metrics

A sample of the exposed metrics is available in METRICS.md.

In addition to these metrics, a series of performance related metrics are published for the collection process. These metrics are documented in performance analysis document.

Getting started

Note: No usage-metrics-collector container image is publicly hosted. Folks will need to build and publish this own until this is resolved.

Installing into a cluster

Kind cluster

Note: only cgroups v1 are currently supported.

  1. Create a kind cluster
  • kind create cluster
  1. Build the image
  • docker build . -t usage-metrics-collector:v0.0.0
  1. Load the image into kind
  • kind load docker-image usage-metrics-collector:v0.0.0
  1. Make sure the Kind cluster values config portion is uncommented in config/metrics-prometheus-collector/configmaps/sampler.yaml
  2. Install the config
  • kustomize build config | kubectl apply -f -
  1. Update your context to use the usage-metrics-collector namespace by default
  • kubectl config set-context --current --namespace=usage-metrics-collector

GKE cluster (cgroups v1: default on 1.25 or lower)

Note: Only cgroups v1 is supported for utilization right now. GKE clusters 1.26+ use cgroups v2 by default.

  1. Build the image
  • docker build . -t my-org/usage-metrics-collector:v0.0.0
  1. Push the image to a container repo
  • docker push my-org/usage-metrics-collector:v0.0.0
  1. Make sure the GKE cluster values config portion is uncommented in config/metrics-prometheus-collector/configmaps/sampler.yaml
  • Other cluster values should be commented
  1. Install the config
  • kustomize build config | kubectl apply -f -
  1. Update your context to use the usage-metrics-collector namespace by default
  • kubectl config set-context --current --namespace=usage-metrics-collector

Kicking the tires

  1. Make sure the pods are healthy
  • kubectl get pods
  1. Make sure the services have endpoints
  • kubectl describe services
  1. Get the metrics from the collector itself
  • kubectl exec -t -i $(kubectl get pods -o name -l app=metrics-prometheus-collector) -- curl localhost:8080/metrics
  • wait for service to be ready
  • kubectl port-forward service/metrics-prometheus-collector 8080:8080
  • visit localhost:8080/metrics in your browser
  1. Get the metrics from prometheus
  • kubectl port-forward $(kubectl get pods -o name -l app=prometheus) 9090:9090
  • visit localhost:9090/ in your browser
  1. View the metrics in Grafana
  • kubectl port-forward service/grafana 3000:3000
  • visit localhost:3000 in your browser
  • enter admin for the username and password
  • go to "Explore"
  • change the source to "prometheus"
  • enter kube_usage_ into the metric field
  • remove the label filters
  • click "Run Query"

Specifying aggregation rules

  1. Edit config/metrics-prometheus-collector/configmaps/collector.yaml
  2. Run make run-local
  3. View the updated metrics in grafana

TODO: Write more on this

Using containerd instead of cgroup walking

TODO: Write this

Configuring cgroup walking

TODO: Write this

Code of conduct

Participation in the Kubernetes community is governed by the Kubernetes Code of Conduct.

usage-metrics-collector's People

Contributors

campoy avatar ehashman avatar k8s-ci-robot avatar mrbobbytables avatar mrueg avatar ndipebot avatar pmorie avatar pwittrock avatar

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.