prometheus / client_ruby
Prometheus instrumentation library for Ruby applications
License: Apache License 2.0
There is currently no decay of summary observations. This might lead to wrong quantile metrics for low-throughput services, due to stale values skewing the results.
The performance of the Ruby client needs to be tested with benchmarks. There have been a few reports of performance issues during metrics scrapes, especially when using many summaries.
I think it would be nice to have the README.md show instructions for the most recent stable version.
If I go to github.com/prometheus/client_ruby and look at the README.md, those are not the instructions for the most recent stable version.
Seeing histogram support land is great. Can the Rack collector be updated to export it?
The metrics endpoint is open and has no authentication.
I use an authentication block to solve this; in my Rails app it's used like this:
config.middleware.use Prometheus::Middleware::Exporter, authentication: ->(env) do
  ActiveSupport::SecurityUtils.secure_compare(
    Rack::Request.new(env).params['secret'].to_s,
    YOUR_SECRET
  )
end
module Prometheus
  module Middleware
    class Exporter
      attr_reader :app, :registry, :path

      FORMATS  = [Client::Formats::Text].freeze
      FALLBACK = Client::Formats::Text
      DEFAULT_AUTHENTICATION = ->(_) { true }

      def initialize(app, options = {})
        @app = app
        @registry = options[:registry] || Client.registry
        @path = options[:path] || '/metrics'
        @acceptable = build_dictionary(FORMATS, FALLBACK)
        @authentication = options[:authentication] || DEFAULT_AUTHENTICATION
      end

      def call(env)
        if env['PATH_INFO'] == @path
          if @authentication.call(env)
            format = negotiate(env, @acceptable)
            format ? respond_with(format) : not_acceptable(FORMATS)
          else
            authentication_failed!
          end
        else
          @app.call(env)
        end
      end

      private

      def authentication_failed!
        [
          401,
          { 'Content-Type' => 'text/plain' },
          ['Authentication Failed']
        ]
      end
    end
  end
end
I wonder when the right time is to push metrics in a background job system like Sidekiq. It seems like too much to do it after each job completes/fails; is a cron job or scheduled task better? And how often?
My understanding is that Sidekiq uses threads, so the registry will be available for every job as long as the process is running, but I wonder what happens with scheduled jobs. Perhaps this question is more for Sidekiq, but some of you might have already answered it for yourselves.
Thanks!
As part of our work in #95 to introduce multi-process support, we made several breaking changes to the interface of the library.
To ease the transition for existing users, we should provide some documentation on the changes.
We (GoCardless) run our services in containers, which means a clean file system every time we boot the app.
We should look at what the behaviour is like for people whose file systems persist between versions of the app. If stale files cause problems, we should look at what mitigations we can implement to make DirectFileStore work by default.
Any edge cases should be added to DirectFileStore's docs.
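One possible mitigation (an assumption on my part, not shipped behaviour) is for the app to wipe stale metric files at boot, before any worker forks, so a persistent file system behaves like a fresh container. A minimal sketch, using a made-up `*.bin` naming pattern rather than DirectFileStore's actual scheme:

```ruby
require 'tmpdir'
require 'fileutils'

# Hypothetical boot-time cleanup: delete leftover metric files from a previous
# version of the app. The '*.bin' glob is illustrative, not the store's real
# file-naming convention.
def clear_stale_metric_files(dir)
  Dir.glob(File.join(dir, '*.bin')).each { |f| File.unlink(f) }
end

dir = Dir.mktmpdir
FileUtils.touch(File.join(dir, 'old_metric.bin'))
clear_stale_metric_files(dir)
puts Dir.glob(File.join(dir, '*.bin')).empty? # true: stale files are gone
```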
Hi, first of all, I would like to thank you for your awesome work.
I found
@metrics_prefix = options[:metrics_prefix] || 'http_server'
…
inside lib/prometheus/middleware/collector.rb, which enables users of this gem to customize the prefix of some default metrics.
However, this line is not present in the released version of this gem (latest version 0.7.1). After downloading the gem and viewing the source, I cannot find the equivalent line.
I would like to push the metrics collected by the Rack Collector middleware to a Pushgateway, but I'm not sure how to accomplish this.
Hi!
I've configured the gem in a Rails App and default metrics seem to be working fine. However, If I try to do something like this:
def index
  gauge = Prometheus::Client::Gauge.new(:room_temperature_celsius, '...')
  gauge.set({ room: 'kitchen' }, 21.534)

  result = User.find_by(username: params[:username])
  if result.nil?
    render json: { msg: 'Error user name not found' }, status: :not_found
  else
    render json: result
  end
end
The metric does not show up at the /metrics path.
Is there anything I'm doing wrong?
Thank you!
One of our internal users raised a point about our method of reading PID files when exporting metrics: it can accidentally include more files than it should. Specifically, if one metric name is a subset of another, the export of the metric with the shorter name could include values from the metric with the longer name.
@dmagliola commented that running into this issue would involve putting a triple underscore in your metric name, which is highly unusual and against conventions, but maybe we can choose a character that never appears in metric names when we generate the file names.
At a minimum, if we make no code change, we should document the behaviour.
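To illustrate the over-matching (the file names and `___` separator here are illustrative, not the exact ones the store generates): if files are matched by metric-name prefix alone, a metric whose name is a prefix of another's picks up the other metric's files.

```ruby
# Hypothetical per-process file names: "<metric_name>___<pid>.txt".
files = ['requests___1234.txt', 'requests_failed___1234.txt']

# Matching by bare metric-name prefix over-includes the longer-named metric:
loose  = files.select { |f| f.start_with?('requests') }
# Including the separator in the prefix restricts the match to one metric:
strict = files.select { |f| f.start_with?('requests___') }

puts loose.length  # 2 -- both files matched
puts strict.length # 1 -- only the intended metric's file
```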
Supporting flexible labels in our out-the-box Rack middleware commits us to maintaining what is currently quite a confusing API.
@dmagliola discussed a few ways to make the API less weird, but they always resulted in the middleware accepting multiple lambdas for custom behaviour as arguments and having almost no behaviour provided out the box - sort of defeating the purpose!
We can always come back and add a better implementation of this functionality later, but it will be a pain to take this version of it away.
I’m in favour of only supporting our fixed set of labels in the Rack middleware we provide, and having a README section advising people to do their own thing if they want something more sophisticated.
In contrast to other Prometheus clients (e.g. Go), the histogram does not use disjoint buckets but cumulative values (see https://github.com/prometheus/client_ruby/blob/master/lib/prometheus/client/histogram.rb#L28).
While this is also a nice way to collect metrics, it should be named CumulativeHistogram, and Histogram should behave as in other client libraries, as this can be confusing, especially when using quantile conversion.
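The difference between the two semantics can be shown with a small pure-Ruby sketch (no gem required), assuming a few observations and bucket boundaries:

```ruby
observations = [0.2, 0.4, 0.4, 1.5, 3.0]
buckets = [0.5, 1.0, 2.5]

# Cumulative semantics (as in the Prometheus exposition format): bucket `le=b`
# counts every observation <= b.
cumulative = buckets.map { |b| [b, observations.count { |o| o <= b }] }.to_h
puts cumulative.inspect # {0.5=>3, 1.0=>3, 2.5=>4}

# Disjoint semantics: each observation is counted only in the first bucket it
# fits (3.0 would fall into an implicit +Inf bucket).
disjoint = buckets.each_with_index.map do |b, i|
  lower = i.zero? ? -Float::INFINITY : buckets[i - 1]
  [b, observations.count { |o| o > lower && o <= b }]
end.to_h
puts disjoint.inspect # {0.5=>3, 1.0=>0, 2.5=>1}
```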
The client library should follow our standard collector design and not scrape metrics directly: https://prometheus.io/docs/instrumenting/writing_clientlibs/#overall-structure
Currently tracking this in tenderlove/mmap#6. The issue is that the gemspec is broken for modern ruby.
The prometheus documentation states that:
Counters should not be used to expose current counts of items whose number can also go down, e.g. the number of currently running goroutines.
Currently, counters can be decremented via decrement, or by passing a negative value into increment.
See https://github.com/prometheus/client_golang/blob/master/prometheus/process_collector.go for the go implementation.
The API that this client uses to push metrics to a push gateway was deprecated and then removed in this commit. Pushes from this client to newer versions of the push gateway fail with a 404 Not Found error.
When exposing buckets, the suffix is missing.
I am a newbie with Prometheus. I wanted to test histograms & summaries with pushgateway.
The README was not helpful for me in setting up basic histogram/summary metrics.
I think it would be very helpful to add a basic example to the examples directory.
I'm pretty strongly against supporting anything below Ruby 2.1, because the lack of required keyword arguments is a pain to work around (you can, with sentinel values, but it's a mess). It's been out of support for so long now that I don't think it justifies the ongoing effort and risk of bugs in our workarounds.
This also raises the discussion of what our Ruby version support policy should be overall. I think it's good to document this up front, to set expectations around how we'll act as maintainers - something we've done before on our own open source projects.
Tangentially, some of the memory optimisations people have been playing with here involve methods introduced to Ruby's stdlib in relatively new versions (one of them was added in 2.5). If they're really worth having, we might be looking at some code that conditionally runs in those versions.
Once we decide what we're doing, we should translate that to the CI matrix of Ruby versions that we run our tests against.
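For reference, the sentinel-value workaround in question looks something like this sketch (method and constant names are illustrative, not from the library):

```ruby
# Ruby >= 2.1: a required keyword argument makes the language enforce the call.
def observe(value:, labels: {})
  [value, labels]
end

# Pre-2.1 emulation with a sentinel default -- the mess referred to above.
MISSING = Object.new
def observe_legacy(opts = {})
  value = opts.fetch(:value, MISSING)
  raise ArgumentError, 'missing keyword: value' if value.equal?(MISSING)
  [value, opts.fetch(:labels, {})]
end

puts observe(value: 1.5).inspect        # [1.5, {}]
puts observe_legacy(value: 1.5).inspect # [1.5, {}]
```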
[
  {
    "baseLabels": {
      "__name__": "http_requests_total"
    },
    "docstring": "A counter of the total number of HTTP requests made.",
    "metric": {
      "type": "counter",
      "value": [
        {
          "labels": {
            "method": "get",
            "path": "/metrics",
            "code": "200"
          },
          "value": 6
        },
        {
          "labels": {
            "method": "get",
            "path": "/",
            "code": "200"
          },
          "value": 11
        }
      ]
    }
  },
  {
    "baseLabels": {
      "__name__": "http_request_durations_total_microseconds"
    },
    "docstring": "The total amount of time spent answering HTTP requests (microseconds).",
    "metric": {
      "type": "counter",
      "value": [
        {
          "labels": {
            "method": "get",
            "path": "/metrics",
            "code": "200"
          },
          "value": 4057
        },
        {
          "labels": {
            "method": "get",
            "path": "/",
            "code": "200"
          },
          "value": 27355
        }
      ]
    }
  },
  {
    "baseLabels": {
      "__name__": "http_request_durations_microseconds"
    },
    "docstring": "A histogram of the response latency (microseconds).",
    "metric": {
      "type": "histogram",
      "value": [
        {
          "labels": {
            "method": "get",
            "path": "/metrics",
            "code": "200"
          },
          "value": {
            "0.5": 609,
            "0.9": 652,
            "0.99": 652
          }
        },
        {
          "labels": {
            "method": "get",
            "path": "/",
            "code": "200"
          },
          "value": {
            "0.5": 1492,
            "0.9": 1628,
            "0.99": 1628
          }
        }
      ]
    }
  },
  {
    "baseLabels": {
      "__name__": "http_exceptions_total"
    },
    "docstring": "A counter of the total number of exceptions raised.",
    "metric": {
      "type": "counter",
      "value": []
    }
  }
]
I've included Prometheus rack middleware in a new Rails app and /metrics returns JSON. Is this normal/expected?
There is currently no support for prometheus' protobuf format. The format is described here: http://prometheus.io/docs/instrumenting/exposition_formats/.
The protobuf definition itself has been created already: https://github.com/prometheus/client_model/tree/master/ruby.
Hi,
I was playing with this lib yesterday, mostly to be able to provide metrics about certain aspects of the application, but I've also tried the Collector as well. The Collector provides the same metrics that you can get with Nginx-Lua, so I'll stay with that, since it is a generic solution for every HTTP app and I think it has a little less overhead being collected with Lua on the Nginx side.
However, the question I would like to ask is about how you are dealing with the Collector tracking URIs, since it collects so much data that a scrape takes many seconds to complete. For example, in staging, with ~3 people accessing the application, the scrape time is about 20s; I cannot imagine how much it could be in production. The easy fix is to not track the URI, but then you lose the ability to identify slow endpoints. Of course it is possible to use logs or implement something else to get the slowest requests, but it would be really nice to keep that data in Prometheus.
Can you please share thoughts and experiences about the Collector?
Hello.
I'm using current master and trying to get rid of the path label in the metrics.
If I do this
If I do this
use Prometheus::Middleware::Collector, counter_label_builder: ->(env, code) {
  {
    method: env['REQUEST_METHOD'].downcase,
    code: code
  }
}
then the http_server_request_duration_seconds_bucket metric gets the path label, and the graph which shows percentiles becomes super slow. However, with this setup http_server_requests_total doesn't have path, so the graph which shows requests per second works OK.
If I change to duration_label_builder, then it works vice versa: http_server_requests_total gets path and http_server_request_duration_seconds_bucket does not have it.
I'm confused about how to remove path from both metrics, because having path in either of these metrics leads to super slow graphs. As I understand it, having path in the metrics makes it slow to squash all these metrics with different paths when you group by something like method. For example, for requests-per-second graphs I have the following query:
sum(rate(http_server_request_duration_seconds_count{job="myjob"}[1m])) by(method)
And if there is a separate http_server_request_duration_seconds_count per path, it becomes slow.
Thanks.
I have problems with pushes.
curl to the Pushgateway works, but this does not:
require 'prometheus/client'
require 'prometheus/client/push'
prometheus = Prometheus::Client.registry
counter = Prometheus::Client::Counter.new(:something_here, 'hello world')
counter.increment({ service: 'foo' })
counter.increment({ service: 'foo' })
counter.increment({ service: 'foo' })
counter.increment({ service: 'foo' })
Prometheus::Client::Push.new('job-1', nil, 'http://my-pushgate:9091').add(prometheus)
I don't see the something_here counter in http://my-pushgate:9091/metrics, only push_time_seconds.
What am I doing wrong?
If you use this gem with a multi-process Rack server such as Unicorn, surely each worker will be returning just a percentage of the correct results (eg., number of requests served, total time), thus making the exposed metrics fairly meaningless?
To solve this the easiest solution is to create a block of shared memory in the master process that all workers share, instead of using instance variables.
Would it be possible to release the latest master since it's been sitting for 4 months?
Thanks.
Hi, this is my config.ru file:
require 'rack'
require 'prometheus/middleware/collector'
require 'prometheus/middleware/exporter'
use Rack::Deflater, if: ->(_, _, _, body) { body.any? && body[0].length > 512 }
use Prometheus::Middleware::Collector
use Prometheus::Middleware::Exporter
run Rails.application
But when I throw a :not_found error using return head :not_found unless user, it gives me:
#<NoMethodError: undefined method `any?' for #<ActionDispatch::Response::RackBody:0x00000003c657c0>>
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/body_proxy.rb:41:in `method_missing'
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/body_proxy.rb:41:in `method_missing'
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/body_proxy.rb:41:in `method_missing'
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/body_proxy.rb:41:in `method_missing'
/home/user/api/config.ru:5:in `block (2 levels) in <main>'
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/deflater.rb:114:in `should_deflate?'
/home/user/.rvm/gems/ruby-2.3.3/gems/rack-2.0.3/lib/rack/deflater.rb:37:in `call'
/home/user/.rvm/gems/ruby-2.3.3/gems/puma-3.8.2/lib/puma/configuration.rb:224:in `call'
/home/user/.rvm/gems/ruby-2.3.3/gems/puma-3.8.2/lib/puma/server.rb:600:in `handle_request'
/home/user/.rvm/gems/ruby-2.3.3/gems/puma-3.8.2/lib/puma/server.rb:435:in `process_client'
/home/user/.rvm/gems/ruby-2.3.3/gems/puma-3.8.2/lib/puma/server.rb:299:in `block in run'
/home/user/.rvm/gems/ruby-2.3.3/gems/puma-3.8.2/lib/puma/thread_pool.rb:120:in `block in spawn_thread'
I am using client_ruby in a rails application with the following config.ru file:
# This file is used by Rack-based servers to start the application.
require ::File.expand_path('../config/environment', __FILE__)
# gzip compression
use Rack::Deflater
# metrics
require 'prometheus/client/rack/collector'
require 'prometheus/client/rack/exporter'
use Prometheus::Client::Rack::Collector
use Prometheus::Client::Rack::Exporter
run Storybook::Application
When analyzing counters such as http_requests_total or http_request_duration_total_seconds, I noticed that the values fluctuate back and forth every few seconds. I confirmed this by constantly refreshing my application.com/metrics page and observing the values. My Grafana dashboard caught this instantly.
http_requests_total exhibits similar behaviour.
Are these fluctuations expected behaviour?
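If this is a multi-process server (Unicorn, or Puma in clustered mode), the likely cause is that each worker keeps its own in-memory registry, and successive scrapes land on different workers. A toy sketch of the effect (worker names and counts are made up):

```ruby
# Two workers, each holding its own independent in-memory counter value.
worker_counters = { worker_1: 120, worker_2: 87 }

# Successive scrapes are served by whichever worker accepts the connection:
scrapes = [:worker_1, :worker_2, :worker_1].map { |w| worker_counters[w] }
puts scrapes.inspect # [120, 87, 120] -- the counter appears to move backwards
```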
I am trying to use the Prometheus client in a Rack application using Grape.
Since Grape APIs do not have an initialize block, I am somewhat lost as to the best place to create the Registry and register metrics.
Am I meant to only ever use a single Client::Registry throughout my code, or can I create new ones where I need them?
Will they share the registered metrics?
I suspect not, and if that is correct, I would appreciate some pointers as to the preferred way of handling the Registry. Should I wrap it in a Singleton?
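Not an authoritative answer, but the usual pattern is one shared registry: the gem's Prometheus::Client.registry is itself a memoized process-wide default, so metrics registered anywhere are visible to the exporter. A gem-free sketch of that idea, using Ruby's Singleton module:

```ruby
require 'singleton'

# Stand-in for the gem's default registry: one instance per process.
class SketchRegistry
  include Singleton

  def initialize
    @metrics = {}
  end

  def register(name, metric)
    @metrics[name] = metric
  end

  def get(name)
    @metrics[name]
  end
end

# Registered in one part of the code (e.g. an initializer)...
SketchRegistry.instance.register(:http_requests_total, :a_counter)

# ...and visible from anywhere else, such as a Grape endpoint:
puts SketchRegistry.instance.get(:http_requests_total).inspect # :a_counter
```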
Using Ruby 2.4.1, prometheus-client 0.7.1, and quantile 0.2.0 (for specificity), the Summary class is generating strange percentiles (or what I assume are supposed to be percentiles).
➜ bundle exec pry
[1] pry(main)> require 'prometheus/client/summary'
=> true
[2] pry(main)> summary = Prometheus::Client::Summary.new(:a, "a")
=> #<Prometheus::Client::Summary:0x00000003041908
@base_labels={},
@docstring="a",
@mutex=#<Thread::Mutex:0x00000003041890>,
@name=:a,
@validator=#<Prometheus::Client::LabelSetValidator:0x00000003041840 @validated={}>,
@values={}>
[3] pry(main)> (1..100_000).each { |n| summary.observe({}, n) }; summary.get
=> {0.5=>27253, 0.9=>44736, 0.99=>49532}
This is something the quantile gem is doing, not something Prometheus::Client::Summary is doing, as the values returned are identical to the ones provided by the underlying library:
➜ bundle exec pry
[1] pry(main)> require 'quantile'
=> true
[2] pry(main)> qe = Quantile::Estimator.new
=> #<Quantile::Estimator:0x0000000238aea8
@buffer=[],
@head=nil,
@invariants=
[#<Quantile::Quantile:0x0000000238ae58 @coefficient_i=0.2, @coefficient_ii=0.2, @inaccuracy=0.05, @quantile=0.5>,
#<Quantile::Quantile:0x0000000238ae30 @coefficient_i=0.20000000000000004, @coefficient_ii=0.022222222222222223, @inaccuracy=0.01, @quantile=0.9>,
#<Quantile::Quantile:0x0000000238ae08 @coefficient_i=0.19999999999999982, @coefficient_ii=0.00202020202020202, @inaccuracy=0.001, @quantile=0.99>],
@observations=0,
@sum=0>
[3] pry(main)> (1..100_000).each(&qe.method(:observe)); nil
=> nil
[4] pry(main)> qe.query(0.5)
=> 27253
[5] pry(main)> qe.query(0.9)
=> 44736
[6] pry(main)> qe.query(0.99)
=> 49532
But it looks like the Java client library has this exact (well, almost) setup as an automated test, and they assert the values are pretty much normal percentiles, but with an error margin:
https://github.com/prometheus/client_java/blob/master/simpleclient/src/test/java/io/prometheus/client/SummaryTest.java#L72-L89
@Test
public void testQuantiles() {
  int nSamples = 1000000; // simulate one million samples
  for (int i=1; i<=nSamples; i++) {
    // In this test, we observe the numbers from 1 to nSamples,
    // because that makes it easy to verify if the quantiles are correct.
    labelsAndQuantiles.labels("a").observe(i);
    noLabelsAndQuantiles.observe(i);
  }
  assertEquals(getNoLabelQuantile(0.5), 0.5 * nSamples, 0.05 * nSamples);
  assertEquals(getNoLabelQuantile(0.9), 0.9 * nSamples, 0.01 * nSamples);
  assertEquals(getNoLabelQuantile(0.99), 0.99 * nSamples, 0.001 * nSamples);
  assertEquals(getLabeledQuantile("a", 0.5), 0.5 * nSamples, 0.05 * nSamples);
  assertEquals(getLabeledQuantile("a", 0.9), 0.9 * nSamples, 0.01 * nSamples);
  assertEquals(getLabeledQuantile("a", 0.99), 0.99 * nSamples, 0.001 * nSamples);
}
Those assertions seem to indicate what I thought would happen, which is the percentiles (under uniform distribution) will be linear with the number of observations.
So, which behavior is correct?
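For comparison, the exact quantiles of the uniform 1..100_000 stream are easy to compute, and a quick sketch shows how far outside the Java test's tolerances the Ruby results above fall:

```ruby
n = 100_000
# The values returned by the quantile gem, as shown in the pry session above.
observed = { 0.5 => 27_253, 0.9 => 44_736, 0.99 => 49_532 }

observed.each do |q, value|
  exact = q * n # for uniform 1..n, the q-quantile should be roughly q * n
  rel_error = (value - exact).abs / exact
  puts format('q=%.2f exact=%d observed=%d rel_error=%.3f', q, exact, value, rel_error)
end
# The 0.5-quantile is off by ~45%, versus the 5% margin the Java test allows.
```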
Hi,
I'm using client_ruby to create/set counter and gauge metrics, but I don't see an option in the library to specify a timestamp.
Is this not possible?
We had a couple of comments on this point, and it’s fair. The point about “no obvious right way to aggregate gauges in all cases” is a solid argument.
We need to decide a stance on this both for gauges specifically and for metrics in general (i.e. should gauges be a special case where we don't aggregate by default).
This issue should be renamed and updated once we decide that stance.
So in most cases with time series data, it doesn't make sense to have cumulative summaries. I've implemented sliding windows in a way that works for my uses, but I thought I'd put it here in case you'd like to modify it slightly for the full gem.
TimeWindowEstimator class
# This class is a wrapper around a single Quantile::Estimator, which is used to hold data for summaries.
# It maintains a ring buffer of Estimators to provide quantiles over a sliding window of time.
module Quantile
  class TimeWindowEstimator
    attr_reader :invariants

    def initialize(invariants, max_age_seconds, age_buckets)
      @invariants = invariants
      @ring_buffer = []
      age_buckets.times { @ring_buffer.push(Estimator.new(*invariants)) }
      @current_bucket = 0
      @last_rotated_time = Time.now
      @duration_between_rotations_seconds = max_age_seconds / age_buckets
    end

    def query(quantile)
      current_estimator = rotate
      current_estimator.query(quantile)
    end

    def observe(value)
      rotate
      @ring_buffer.each do |estimator|
        estimator.observe(value)
      end
    end

    def sum
      current_estimator = rotate
      current_estimator.sum
    end

    def observations
      current_estimator = rotate
      current_estimator.observations
    end

    private

    def rotate
      time_since_last_rotate = Time.now - @last_rotated_time
      while time_since_last_rotate > @duration_between_rotations_seconds
        # Clear the current bucket
        @ring_buffer[@current_bucket] = Estimator.new(*@invariants)
        @current_bucket += 1
        @current_bucket = 0 if @current_bucket >= @ring_buffer.length
        time_since_last_rotate -= @duration_between_rotations_seconds
        @last_rotated_time += @duration_between_rotations_seconds
      end
      @ring_buffer[@current_bucket]
    end
  end
end
TimeWindowSummary
require 'quantile'
require 'prometheus/client/summary'
require_relative 'time_window_estimator'

# rubocop:disable Metrics/LineLength
# rubocop:disable Metrics/ParameterLists
module Prometheus
  module Client
    # Summary is an accumulator for samples. It captures Numeric data and
    # provides an efficient quantile calculation mechanism.
    # TimeWindowSummary is a Summary, but with a sliding window of time for metrics.
    class TimeWindowSummary < Summary
      attr_reader :invariants, :max_age_seconds, :num_buckets

      # Time window summaries take:
      #   name: name of the metric
      #   docstring: description of the metric
      #   max_age_seconds: the duration of the time window, i.e. how long observations are kept before they are discarded
      #   invariants: array of quantiles and their given error bounds
      #   num_buckets: the number of buckets used to implement the sliding time window. If your time window is 10 minutes
      #     and num_buckets=5, buckets are switched every 2 minutes. The value is a trade-off between resources
      #     (memory and CPU for maintaining the buckets) and how smoothly the time window moves.
      #   base_labels: optional set of labels
      def initialize(name, docstring, invariants, max_age_seconds, num_buckets, base_labels = {})
        @invariants = invariants
        @max_age_seconds = max_age_seconds
        @num_buckets = num_buckets
        super(name, docstring, base_labels)
      end

      # Type must be :summary so that the Prometheus scraper still treats it as a valid type
      def type
        :summary
      end

      private

      def default
        Quantile::TimeWindowEstimator.new(@invariants, @max_age_seconds, @num_buckets)
      end
    end
  end
end
# rubocop:enable Metrics/LineLength
# rubocop:enable Metrics/ParameterLists
I did not find a delete or remove method in client_ruby. For example, with a gauge:
gauge.set({ room: 'kitchen' }, 21.534)
gauge.get({ room: 'kitchen' })
Is there any gauge.delete or gauge.remove method? I did not find one in the source code either.
Docs say that the metric name should match [a-zA-Z_:][a-zA-Z0-9_:]*: https://prometheus.io/docs/concepts/data_model/#metric-names-and-labels
The Ruby library just checks that it is a symbol: https://github.com/prometheus/client_ruby/blob/master/lib/prometheus/client/metric.rb#L49
See also prometheus/client_golang#255.
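A stricter check could be layered on top of the symbol check. This is a sketch of what's being requested, not the library's actual code (the method name is made up; the regex is from the data model docs linked above):

```ruby
# Anchored version of the metric-name regex from the Prometheus data model.
METRIC_NAME_RE = /\A[a-zA-Z_:][a-zA-Z0-9_:]*\z/

def valid_metric_name?(name)
  name.is_a?(Symbol) && name.to_s.match?(METRIC_NAME_RE)
end

puts valid_metric_name?(:http_requests_total) # true
puts valid_metric_name?(:'2xx_responses')     # false -- cannot start with a digit
puts valid_metric_name?(:'bad-name')          # false -- '-' is not allowed
```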
#95 was a significant re-write. I understand that people want to do more work before a "proper" 1.0 release, but it'd be nice to have a RubyGems release of the current master that we can point at instead of pointing at the git repository. Maybe 0.10.0 or 1.0.0-alpha.1 (i.e. a prerelease version) or something?
For context, we have been using the gocardless fork for a while and in alphagov/verify-frontend#697 we switched to the official master branch.
Hey guys, your library is awesome!
Can I ask you for a favour: could you release an rc1 or so of version 0.7.0? Unfortunately, our project can't use the master branch from GitHub, but the current master has super exciting and important changes.
In the Go client, a hash of the labels and the metric name is used to check uniqueness.
https://github.com/prometheus/client_golang/blob/master/prometheus/desc.go#L71
In the Ruby client that's not the case; only the metric name is used to identify unique metrics.
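A sketch of the difference (the key construction here is illustrative; the Go client actually builds an FNV hash over the fully-qualified name and the label names):

```ruby
a = { name: :http_requests_total, labels: [:method] }
b = { name: :http_requests_total, labels: [:method, :code] }

# Ruby client today: identity is the name alone, so these two collide.
ruby_key = ->(m) { m[:name] }
puts ruby_key.(a) == ruby_key.(b) # true -- treated as the same metric

# Go-style: identity covers the label names too, so they are distinct.
go_key = ->(m) { [m[:name], m[:labels].sort] }
puts go_key.(a) == go_key.(b)     # false
```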
As it currently stands, the Prometheus Ruby Client has a few issues that make it hard to adopt in mainstream Ruby projects, particularly in Web applications:
We'd like to contribute to the effort of improving the client in these directions, and we're proposing we could make a number of changes (this issue will be promptly followed by a PR that implements several of these).
We have reached out to @grobie recently and he suggested that releasing a new major version was the way to go in order to work around these issues.
There are several proposals in this RFC for improvements to the existing Prometheus Client for Ruby.
These proposals are largely independent of each other, so we can pick for each one whether we think it’s an improvement or not. They are also ordered from most to least important. Only the first one is an absolute must, since it paves the way for adding multi-process support.
In the current client, each Metric object has an internal @values hash to store the metric values. For Gauges and Counters, the value of this hash is a float, and the key is itself a hash of labels and their values. Thus, for one given metric there are multiple values at any given time, one for each combination of the values of its labels.
For Histograms, @values doesn't hold a float. Instead it holds a Histogram::Value object, which holds one integer for each bucket, plus the total number of observations and the sum of all observations. Summaries do a similar thing.
We're proposing that, instead of each metric holding their own counter internally, we should have a centralized store that holds all the information. Metric objects update this store for every observation, and it gets read in its entirety by a formatter when the metrics are scraped.
Having this central storage allows us to abstract how that data is stored internally. For simpler cases, we can simply use a large hash, similar to the current implementation. But other situations (like pre-fork servers) require more involved implementations, to be able to share memory between different processes and report coherent total numbers. Abstracting the storage away allows the rest of the client to be agnostic about this complexity, and allows for multiple different “backends” that can be swapped based on the needs of each particular user.
Prometheus would have a global config object that allows users to set which Store they want:
module Prometheus
  module Client
    class Config
      attr_accessor :data_store

      def initialize
        @data_store = DataStores::SimpleHash.new
      end
    end
  end
end
As a default, a simple storage system that provides the same functionality as the current client is provided. Other backends may be shipped with the gem, and users can also make their own. Note that we set the data store to an instantiated object, not a class, because that object may need store-specific parameters when being instantiated (file paths, connection strings, etc.).
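Under this proposal, a user swapping stores would do something like the following in an initializer (the store name and its arguments here are hypothetical, per the RFC, not a released API):

```ruby
# Hypothetical configuration: replace the default in-memory store with a
# custom one that takes store-specific parameters at instantiation time.
Prometheus::Client.config.data_store =
  Prometheus::Client::DataStores::SomeCustomStore.new(dir: '/tmp/prometheus')
```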
These swappable stores have the following interface:
module Prometheus
  module Client
    module DataStores
      class ExampleCustomStore
        # Returns a MetricStore, which provides a view of the internal data store,
        # catering specifically to that metric.
        #
        # - `metric_settings` specifies configuration parameters for this metric
        #   specifically. These may or may not be necessary, depending on each specific
        #   store and metric. The most obvious example of this is for gauges in
        #   multi-process environments, where the developer needs to choose how those
        #   gauges will get aggregated between all the per-process values.
        #
        #   The settings that the store will accept, and what it will do with them, are
        #   100% store-specific. Each store should document what settings it will accept
        #   and how to use them, so the developer using that store can pass the
        #   appropriate settings when instantiating the Store itself and the Metrics
        #   they declare.
        #
        # - `metric_type` is specified in case a store wants to validate that the
        #   settings are valid for the metric being set up. It may go unused by most
        #   stores.
        #
        # Even if your store doesn't need these two parameters, it must accept them
        # to keep stores swappable.
        def for_metric(metric_name, metric_type:, metric_settings: {})
          # Generally, here a store would validate that the settings passed in are
          # valid, and raise if they aren't.
          validate_metric_settings(metric_type: metric_type,
                                   metric_settings: metric_settings)
          MetricStore.new(store: self,
                          metric_name: metric_name,
                          metric_type: metric_type,
                          metric_settings: metric_settings)
        end

        # MetricStore manages the data for one specific metric. It's generally a view
        # onto the central store shared by all metrics, but it could also hold the data
        # itself if that's better for the specific scenario.
        class MetricStore
          # This constructor is internal to this store, so the signature doesn't need
          # to be this. No one other than the Store should be creating MetricStores.
          def initialize(store:, metric_name:, metric_type:, metric_settings:)
          end

          # Metrics may need to modify multiple values at once (Histograms do this, for
          # example). MetricStore needs to provide a way to synchronize those, in
          # addition to all of the value modifications being thread-safe without a need
          # for simple Metrics to call `synchronize`.
          def synchronize
            raise NotImplementedError
          end

          # Store a value for this metric and a set of labels.
          # Internally, this may add extra "labels" to disambiguate values between,
          # for example, different processes.
          def set(labels:, val:)
            raise NotImplementedError
          end

          def increment(labels:, by: 1)
            raise NotImplementedError
          end

          # Return a value for a set of labels.
          # Will return the same value stored by `set`, as opposed to `all_values`,
          # which may aggregate multiple values.
          #
          # For example, in a multi-process scenario, `set` may add an extra internal
          # label tagging the value with the process ID. `get` will return the value
          # for "this" process ID. `all_values` will return an aggregated value for
          # all process IDs.
          def get(labels:)
            raise NotImplementedError
          end

          # Returns all the sets of labels seen by the store, and the aggregated value
          # for each.
          #
          # In some cases, this is just a matter of returning the stored value.
          #
          # In other cases, the store may need to aggregate multiple values for the
          # same set of labels. For example, in a multi-process scenario it may need to
          # `sum` the values of counters from each process. Or for gauges, it may need
          # to take the `max`. This is generally specified in `metric_settings` when
          # calling `Store#for_metric`.
          def all_values
            raise NotImplementedError
          end
        end
      end
    end
  end
end
For example, the default implementation of this interface would be something like this (like all the code in this doc, it is simplified to explain the general idea; it is not final code):
module Prometheus
module Client
module DataStores
# Stores all the data in a simple, synchronized global Hash
#
# There are ways of making this faster (because of the naive Mutex usage).
class SimpleHash
def initialize
@internal_store = Hash.new { |hash, key| hash[key] = 0.0 }
end
def for_metric(metric_name, metric_type:, metric_settings: {})
# We don't need `metric_type` or `metric_settings` for this particular store
MetricStore.new(store: self, metric_name: metric_name)
end
private
class MetricStore
def initialize(store:, metric_name:)
@store = store
@internal_store = store.internal_store
@metric_name = metric_name
end
def synchronize
@store.synchronize { yield }
end
def set(labels:, val:)
synchronize do
@internal_store[store_key(labels)] = val.to_f
end
end
def increment(labels:, by: 1)
synchronize do
@internal_store[store_key(labels)] += by
end
end
def get(labels:)
synchronize do
@internal_store[store_key(labels)]
end
end
def all_values
store_copy = synchronize { @internal_store.dup }
store_copy.each_with_object({}) do |(labels, v), acc|
if labels["__metric_name"] == @metric_name
label_set = labels.reject { |k,_| k == "__metric_name" }
acc[label_set] = v
end
end
end
private
def store_key(labels)
labels.merge(
{ "__metric_name" => @metric_name }
)
end
end
end
end
end
end
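The `"__metric_name"` internal-label trick that lets many metrics share one Hash can be seen in isolation. This is a hedged sketch: `store`, `store_key` and `all_values` here are standalone illustrations, not the gem's actual API.

```ruby
# A plain Hash stands in for the store; 0.0 is the default value.
store = Hash.new(0.0)

# Keys are the user-visible labels merged with a reserved internal label.
store_key = ->(metric_name, labels) { labels.merge("__metric_name" => metric_name) }

# Two different metrics share the same Hash without colliding:
store[store_key.call("http_requests_total", { "code" => "200" })] += 1
store[store_key.call("http_requests_total", { "code" => "500" })] += 1
store[store_key.call("queue_depth", { "queue" => "mail" })] = 7.0

# Reading one metric back filters by the internal label, then strips it out:
def all_values(store, metric_name)
  store.each_with_object({}) do |(labels, v), acc|
    next unless labels["__metric_name"] == metric_name
    acc[labels.reject { |k, _| k == "__metric_name" }] = v
  end
end

all_values(store, "http_requests_total")
# => { {"code"=>"200"} => 1.0, {"code"=>"500"} => 1.0 }
```

The internal label never escapes the store: `all_values` strips it before the Formatters ever see a label set.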
A more complex store might look like this (note: this is based on a fantasy primitive that magically shares memory between processes; it's just to illustrate how the extra internal labels and aggregators work):
module Prometheus
module Client
module DataStores
# Stores all the data in a magic data structure that keeps cross-process data, in a
# way that all processes can read it, but each can write only to their own set of
# keys.
# It doesn't care how that works, this is not an actual solution to anything,
# just an example of how the interface would work with something like that.
#
# Metric Settings have one possible key, `aggregation`, which must be one of
# `AGGREGATION_MODES`
class SampleMagicMultiprocessStore
AGGREGATION_MODES = [MAX = :max, MIN = :min, SUM = :sum, AVG = :avg]
DEFAULT_METRIC_SETTINGS = { aggregation: SUM }
def initialize
@internal_store = MagicHashSharedBetweenProcesses.new # PStore, for example
end
def for_metric(metric_name, metric_type:, metric_settings: {})
settings = DEFAULT_METRIC_SETTINGS.merge(metric_settings)
validate_metric_settings(metric_settings: settings)
MetricStore.new(store: self,
metric_name: metric_name,
metric_type: metric_type,
metric_settings: settings)
end
private
def validate_metric_settings(metric_settings:)
raise unless metric_settings.has_key?(:aggregation)
raise unless AGGREGATION_MODES.include?(metric_settings[:aggregation])
end
class MetricStore
def initialize(store:, metric_name:, metric_type:, metric_settings:)
@store = store
@internal_store = store.internal_store
@metric_name = metric_name
@aggregation_mode = metric_settings[:aggregation]
end
def set(labels:, val:)
@internal_store[store_key(labels)] = val.to_f
end
def get(labels:)
@internal_store[store_key(labels)]
end
def all_values
non_aggregated_values = all_store_values.each_with_object({}) do |(labels, v), acc|
if labels["__metric_name"] == @metric_name
label_set = labels.reject { |k, _| ["__metric_name", "__pid"].include?(k) }
acc[label_set] ||= []
acc[label_set] << v
end
end
# Aggregate all the different values for each label_set
non_aggregated_values.each_with_object({}) do |(label_set, values), acc|
acc[label_set] = aggregate(values)
end
end
private
def all_store_values
# This assumes there's something shared that all processes can write to, and
# that it's magically synchronized (which is not true of a PStore, for example,
# but would be true of an external data store like Redis, Memcached or SQLite).
# Alternatively, this could do something like:
# file_list = Dir.glob(File.join(path, '*.db')).sort
# which reads all the PStore / mmapped files, etc., and returns a hash
# combining all of them, which `values` and `label_sets` can then use
end
# This method holds most of the key to how this Store works. By adding `__pid`
# as one of the labels, we keep each process's value separate, so we can
# aggregate them later
def store_key(labels)
labels.merge(
{
"__metric_name" => @metric_name,
"__pid" => Process.pid
}
)
end
def aggregate(values)
# This is a horrible way to do this, just illustrating the point
values.send(@aggregation_mode)
end
end
end
end
end
end
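The aggregation step at the end of `all_values` is simple enough to sketch on its own. This illustrative snippet (not gem code) uses `send` the same way as the example above; `:avg` needs its own branch since Ruby's Array has `sum`, `max` and `min` but no built-in average.

```ruby
# Collapse one metric's per-process values according to the configured mode.
def aggregate(values, mode)
  case mode
  when :avg then values.sum.to_f / values.size
  else values.send(mode) # :sum, :max, :min are plain Enumerable methods
  end
end

# e.g. one gauge reading per worker process, keyed by "__pid" in the store
per_process_values = [120.0, 340.0, 95.0]

aggregate(per_process_values, :sum) # => 555.0
aggregate(per_process_values, :max) # => 340.0
aggregate(per_process_values, :avg) # => 185.0
```

This is why `sum` is the right default for counters (each process counts its own events) while gauges often want `max`, `min` or `avg`.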
The way you’d use these stores and aggregators would be something like:
Client.config.data_store = DataStores::SampleMagicMultiprocessStore.new(dir: "/tmp/prom")
Client.registry.counter(
:http_requests_total,
docstring: 'Number of HTTP requests'
)
Client.registry.gauge(
:max_memory_in_a_process,
docstring: 'Maximum memory consumed by one process',
store_settings: { aggregation: DataStores::SampleMagicMultiprocessStore::MAX }
)
For all other metrics, you'd just get sum by default, which is probably what you want.
Stores are ultimately used only by the Metrics and the Formatters; the user never touches them directly.
The way Metrics work is similar to this:
class Metric
def initialize(name,
docstring:,
store_settings: {})
@store = Prometheus::Client.config.data_store.for_metric(name, metric_type: type, metric_settings: store_settings)
end
def get(labels: {})
@store.get(labels: label_set_for(labels))
end
def values
@store.all_values
end
end
class Counter < Metric
def type
:counter
end
def increment(by: 1, labels: {})
@store.increment(labels: label_set_for(labels), by: by)
end
end
Storage for Histograms and Summaries
In the current client, Histograms use a special value object to hold the number of observations for each bucket, plus a total and a sum. Our stores don't allow this, since they're a simple Hash that stores floats and nothing else.
To work around this, Histograms add special, reserved labels when interacting with the store. These are the same labels that'll be exposed when exporting the metrics, so there isn't a huge impedance problem with doing this. The main difference is that Histograms need to override the get and values methods of Metric to recompose these individual bucket values into a coherent Hash.
class Histogram < Metric
def observe(value, labels: {})
base_label_set = label_set_for(labels)
@store.synchronize do
buckets.each do |upper_limit|
next if value > upper_limit
@store.increment(labels: base_label_set.merge(le: upper_limit), by: 1)
end
@store.increment(labels: base_label_set.merge(le: "+Inf"), by: 1)
@store.increment(labels: base_label_set.merge(le: "sum"), by: value)
end
end
# Returns all label sets with their values expressed as hashes with their buckets
def values
all = @store.all_values
all.each_with_object({}) do |(label_set, value), acc|
actual_label_set = label_set.reject { |k, _| k == :le }
acc[actual_label_set] ||= @buckets.map { |b| [b, 0.0] }.to_h
acc[actual_label_set][label_set[:le]] = value
end
end
end
Example usage:
let(:histogram) do
described_class.new(:bar,
docstring: 'bar description',
labels: expected_labels,
buckets: [2.5, 5, 10])
end
it 'returns a hash of all recorded observations' do
histogram.observe(3, labels: { status: 'bar' })
histogram.observe(5, labels: { status: 'bar' })
histogram.observe(6, labels: { status: 'foo' })
expect(histogram.values).to eql(
{ status: 'bar' } => { 2.5 => 0.0, 5 => 2.0, 10 => 2.0, "+Inf" => 2.0, "sum" => 8.0 },
{ status: 'foo' } => { 2.5 => 0.0, 5 => 0.0, 10 => 1.0, "+Inf" => 1.0, "sum" => 6.0 },
)
end
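The bucketing arithmetic behind those expected values can be checked with a plain Hash standing in for the store. This sketch reproduces the `observe` logic above in isolation (the lambda and local names are illustrative):

```ruby
buckets = [2.5, 5, 10]
store = Hash.new(0.0)

# Same logic as Histogram#observe: increment every bucket the value fits in,
# plus the "+Inf" bucket and the running sum, all as reserved `le` labels.
observe = lambda do |value, labels|
  buckets.each do |upper_limit|
    next if value > upper_limit
    store[labels.merge(le: upper_limit)] += 1
  end
  store[labels.merge(le: "+Inf")] += 1
  store[labels.merge(le: "sum")] += value
end

observe.call(3, { status: 'bar' }) # lands in buckets 5 and 10
observe.call(5, { status: 'bar' }) # lands in buckets 5 and 10 (le is inclusive)

store[{ status: 'bar', le: 5 }]     # => 2.0
store[{ status: 'bar', le: 2.5 }]   # => 0.0 (never incremented)
store[{ status: 'bar', le: "sum" }] # => 8.0
```

Note that buckets are cumulative: an observation of 3 increments both the 5 and the 10 buckets, which is why the spec expects 2.0 in both.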
For Summaries, we'd apply a similar solution.
Summary
We would change Summaries to expose only sum and count, with no quantiles / percentiles.
This is a bit of a contentious proposal in that it's not something we're doing because it'll make the client better, but because the way Summaries work is not very compatible with the idea of "Stores", which we'd need for pre-fork servers.
The quantile gem doesn't play well with this, since we'd need to store instances of Quantile::Estimator, which is a complex data structure and tricky to share between Ruby processes.
Moreover, individual Summaries on different processes cannot be aggregated, so all processes would actually have to share one instance of this class, which makes it extremely tricky, particularly to do performantly.
Even though this is a loss of functionality, this puts the Ruby client on par with other client libraries, like the Python one, which also only offers sum and count without quantiles.
Also, this is actually more compliant with the Client Library best practices:
The original client didn't comply with the last two rules, whereas this proposal would, just like the Python client. And quantiles, while seemingly the point of Summaries, are encouraged but not required by the best practices.
We're not ruling out adding quantiles back later: either they'd work only in single-process mode, or we may find a better way of dealing with multiple processes.
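A quantile-less Summary needs only two float values per label set, which is exactly what the store interface can hold. This is a hedged sketch of the idea, with a plain Hash standing in for a store's MetricStore and the `s:` reserved label purely illustrative:

```ruby
class SketchSummary
  def initialize
    @store = Hash.new(0.0) # stands in for a float-only MetricStore
  end

  # Each observation touches exactly two store keys: a running sum and a count.
  def observe(value, labels: {})
    @store[labels.merge(s: "sum")]   += value
    @store[labels.merge(s: "count")] += 1
  end

  def get(labels: {})
    {
      "sum"   => @store[labels.merge(s: "sum")],
      "count" => @store[labels.merge(s: "count")]
    }
  end
end

summary = SketchSummary.new
summary.observe(0.5, labels: { path: "/" })
summary.observe(1.5, labels: { path: "/" })
summary.get(labels: { path: "/" }) # => { "sum" => 2.0, "count" => 2.0 }
```

Because sums and counts from different processes can simply be added, this shape aggregates cleanly across processes, unlike a Quantile::Estimator.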
The current client enforces that labels for a metric don't change once the metric has been observed once. However, there is no way to declare what the labels should be when creating the metric, as the best practices demand. There's also no facility to access a labeled dimension via a labels method as the best practices describe, allowing things like metric.labels(role: 'admin').inc().
We propose changing the signature of a Metric's initialize method to:
def initialize(name,
docstring,
labels: [],
preset_labels: {})
labels is an array listing all the label names that are both allowed and required by this metric.
preset_labels is the same idea as the current base_labels parameter: it specifies "default" values for some labels. The difference is that, in this proposed interface, a label with a pre-set value must be specified in both the labels and preset_labels params.
LabelSetValidator basically changes to compare preset_labels.merge(labels).keys == labels, instead of storing the previously validated label sets and comparing against those. The rest of the validations remain basically the same.
We also add a with_labels method that behaves the way the best practices suggest for labels() (with_labels is a more idiomatic name in Ruby). Given a metric, calling with_labels on it returns a new Metric object with those labels added to preset_labels.
Calling with_labels(role: :admin), caching the result, and then calling increment on it multiple times will be slightly faster than passing labels: { role: :admin } on every call, as we can skip the label validation.
module Prometheus
module Client
class Metric
# When initializing a metric we specify the list of labels that are allowed,
# and we can specify pre-set values for some (or all) of them
#
# `preset_labels` is the same idea as the current `base_labels`, with the
# difference that each pre-set label also needs to appear in `labels`
def initialize(name,
docstring,
labels: [],
preset_labels: {})
@allowed_labels = labels
@validator = LabelSetValidator.new(allowed_labels: labels)
@preset_labels = preset_labels
@validator.valid?(@preset_labels) if @preset_labels
end
# This is the equivalent of the `labels` method specified in the best practices.
# `with_labels` is a more idiomatic name in my opinion, and it's less confusing,
# since `labels` could be something that lists all the allowed labels or all the
# observed label values for the metric.
#
# Like the best practices mention, this can be cached by the client, for
# performance. This will save the time of validating the labels for every
# `increment` or `set`, but it won't save the time to increment the actual
# counter in the store, since the hash lookup still needs to happen.
def with_labels(labels)
all_labels = @preset_labels.merge(labels)
@validator.valid?(all_labels)
return self.class.new(@name,
@docstring,
labels: @allowed_labels,
preset_labels: all_labels)
end
private
def label_set_for(labels)
@validator.validate(@preset_labels.merge(labels))
end
end
end
end
NOTE: For simplicity, this code doesn't actually get faster when caching the result of with_labels, but it's easy to make that change.
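The caching pattern the proposal describes can be sketched with a stripped-down counter. Everything here is illustrative (the class name, the simplified key-equality validation), not the gem's real implementation:

```ruby
class SketchCounter
  def initialize(name, labels: [], preset_labels: {})
    @name = name
    @allowed_labels = labels
    @preset_labels = preset_labels
    @values = Hash.new(0.0) # stands in for a store
  end

  # Validate once, return a child metric with the labels baked in.
  def with_labels(labels)
    all = @preset_labels.merge(labels)
    raise ArgumentError, "unexpected labels" unless all.keys.sort == @allowed_labels.sort
    self.class.new(@name, labels: @allowed_labels, preset_labels: all)
  end

  def increment(by: 1, labels: {})
    @values[@preset_labels.merge(labels)] += by
  end

  def get(labels: {})
    @values[@preset_labels.merge(labels)]
  end
end

counter = SketchCounter.new(:requests_total, labels: [:role])
admin = counter.with_labels(role: :admin) # validated once; cache and reuse
3.times { admin.increment }
admin.get # => 3.0
```

The hot path (`increment`) only does a Hash merge and lookup; the label validation happened once, when `with_labels` built the child metric.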
Questions:
Should we allow nil as a label value, or reject nil outright?
Should label values be converted with to_s inside the Metric?
Currently, the client raises an exception when anything is wrong with labels. While any such problem should be caught in development, we wouldn’t want to 500 on a request because of some unexpected situation with labels.
Ideally, we would raise in dev / test, but notify our exception manager in production.
We propose adding a label_error_strategy config option, which defaults to Raise, but which the user can change to whatever they need.
Something like:
module Prometheus
class Config
attr_accessor :label_error_strategy
def initialize
@label_error_strategy = ErrorStrategies::Raise
end
end
module ErrorStrategies
class Raise
def self.label_error(e)
raise e
end
end
end
end
The Prometheus Client would only provide Raise as a strategy. We might also want to provide strategies for Sentry / Rollbar / Bugsnag / etc., but simply allowing users to swap these in should be enough.
Note that the Client makes no attempt at figuring out whether it's running in production or development, and deciding anything based on that. This is left to the user.
An example of using this would be:
class RavenPrometheusLabelStrategy
def self.label_error(e)
Raven.notify_exception(e)
end
end
Prometheus::Client.config.label_error_strategy = RavenPrometheusLabelStrategy
Use kwargs throughout the code
We believe using keyword arguments will make the Client nicer to use and clearer. It is also more idiomatic in modern Ruby.
Examples:
counter.increment(by: 2)
vs
counter.increment({}, 2)
Registry.counter(:requests_total,
labels: ['code', 'app'],
preset_labels: { app: 'MyApp' })
vs
Registry.counter(:requests_total, ['code', 'app'], { app: 'MyApp' })
The main point against this is that Ruby < 2.0 doesn’t support them fully, but those versions have been EOL for over 3 years now, so we shouldn't need to continue to support them.
Something like:
class Counter
def count_exceptions(type: StandardError)
yield
rescue type
increment
raise
end
end
This should be used like:
def dodgy_code
my_counter.count_exceptions do
# dodgy code
end
rescue
# actually rescue the exception and do something useful with it
end
Not much explanation needed for this one
Something like:
class Gauge
def track_in_progress
increment
yield
ensure
decrement
end
end
Something like:
class Gauge
def time
t = Process.clock_gettime(Process::CLOCK_MONOTONIC)
yield
ensure
set(Process.clock_gettime(Process::CLOCK_MONOTONIC) - t)
end
end
For Histograms and Summaries, apply s/set()/observe()/ in the code above.
Class methods on Histogram that return an array of bucket upper limits, for users to pass to the Histogram constructor:
Registry.histogram(name: "foo", buckets: Histogram.linear_buckets(0, 10, 10))
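One plausible implementation of those helpers, modelled on the Go client's LinearBuckets / ExponentialBuckets (the exact signatures here are assumptions, not the agreed API):

```ruby
class Histogram
  # `count` buckets starting at `start`, spaced `width` apart.
  def self.linear_buckets(start, width, count)
    count.times.map { |i| start + i * width }
  end

  # `count` buckets starting at `start`, each `factor` times the last.
  def self.exponential_buckets(start, factor, count)
    count.times.map { |i| start * factor**i }
  end
end

Histogram.linear_buckets(0, 10, 5)     # => [0, 10, 20, 30, 40]
Histogram.exponential_buckets(1, 2, 5) # => [1, 2, 4, 8, 16]
```

These return only the finite upper limits; the "+Inf" bucket is implicit and added by the Histogram itself.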
https://prometheus.io/docs/instrumenting/writing_clientlibs/#standard-and-runtime-collectors
I would like to push some metrics to my Pushgateway. However, the Pushgateway is secured and requires authentication. How do I do this? I cannot see any examples of it.
Thanks :)
I've set up a custom exporter and Prometheus is scraping it correctly, but it won't store the data. When I query the endpoint manually, everything seems fine.
Any ideas?
root@stats >>> curl odroid:5000/metrics
# TYPE sma_dc_power_kw gauge
# HELP sma_dc_power_kw DC Power
sma_dc_power_kw{phase="1"} 0.0
sma_dc_power_kw{phase="2"} 0.0
# TYPE sma_dc_voltage gauge
# HELP sma_dc_voltage DC Voltage
sma_dc_voltage{phase="1"} 0.0
sma_dc_voltage{phase="2"} 0.0
# TYPE sma_dc_current gauge
# HELP sma_dc_current DC Current
sma_dc_current{phase="1"} 0.0
sma_dc_current{phase="2"} 0.0
# TYPE sma_ac_power_kw gauge
# HELP sma_ac_power_kw AC Power
sma_ac_power_kw{phase="1"} 0.0
sma_ac_power_kw{phase="2"} 0.0
sma_ac_power_kw{phase="3"} 0.0
# TYPE sma_ac_current gauge
# HELP sma_ac_current AC Current
sma_ac_current{phase="1"} 0.0
sma_ac_current{phase="2"} 0.0
sma_ac_current{phase="3"} 0.0
# TYPE sma_ac_voltage gauge
# HELP sma_ac_voltage AC Voltage
sma_ac_voltage{phase="1"} 0.0
sma_ac_voltage{phase="2"} 0.0
sma_ac_voltage{phase="3"} 0.0
# TYPE sma_device_temperature gauge
# HELP sma_device_temperature SMA device temperature
sma_device_temperature 0.0
# TYPE sma_device_state gauge
# HELP sma_device_state SMA device state
sma_device_state
# TYPE sma_device_sn gauge
# HELP sma_device_sn SMA device serialnumber
sma_device_sn
# TYPE sma_grid_state gauge
# HELP sma_grid_state SMA grid state
sma_grid_state
# TYPE sma_grid_freq gauge
# HELP sma_grid_freq SMA grid state
sma_grid_freq
# TYPE http_requests_total counter
# HELP http_requests_total A counter of the total number of HTTP requests made.
http_requests_total{method="get",host="localhost:5000",path="/metrics",code="200"} 393698
http_requests_total{method="get",host="odroid:5000",path="/",code="200"} 1
http_requests_total{method="get",host="odroid:5000",path="/metrics",code="200"} 57735
# TYPE http_request_duration_seconds summary
# HELP http_request_duration_seconds A histogram of the response latency.
http_request_duration_seconds{method="get",host="localhost:5000",path="/metrics",code="200",quantile="0.5"} 0.011047742
http_request_duration_seconds{method="get",host="localhost:5000",path="/metrics",code="200",quantile="0.9"} 0.017268368
http_request_duration_seconds{method="get",host="localhost:5000",path="/metrics",code="200",quantile="0.99"} 0.029610358
http_request_duration_seconds_sum{method="get",host="localhost:5000",path="/metrics",code="200"} 4950.962471179024
http_request_duration_seconds_count{method="get",host="localhost:5000",path="/metrics",code="200"} 393698
http_request_duration_seconds{method="get",host="odroid:5000",path="/",code="200",quantile="0.5"} 6.2291e-05
http_request_duration_seconds{method="get",host="odroid:5000",path="/",code="200",quantile="0.9"} 6.2291e-05
http_request_duration_seconds{method="get",host="odroid:5000",path="/",code="200",quantile="0.99"} 6.2291e-05
http_request_duration_seconds_sum{method="get",host="odroid:5000",path="/",code="200"} 6.2291e-05
http_request_duration_seconds_count{method="get",host="odroid:5000",path="/",code="200"} 1
http_request_duration_seconds{method="get",host="odroid:5000",path="/metrics",code="200",quantile="0.5"} 0.01191175
http_request_duration_seconds{method="get",host="odroid:5000",path="/metrics",code="200",quantile="0.9"} 0.012619823
http_request_duration_seconds{method="get",host="odroid:5000",path="/metrics",code="200",quantile="0.99"} 0.012885253
http_request_duration_seconds_sum{method="get",host="odroid:5000",path="/metrics",code="200"} 719.6265259969975
http_request_duration_seconds_count{method="get",host="odroid:5000",path="/metrics",code="200"} 57735
# TYPE http_exceptions_total counter
# HELP http_exceptions_total A counter of the total number of exceptions raised.
From running it in production, we (GoCardless) have found DirectFileStore to be more memory-hungry than other ways of storing metrics.
This isn't entirely surprising, but we've done a little investigation into whether we could reduce the effect and we may have some improvements we can make.
One mitigation we found after looking round the internet was to switch from libc malloc to jemalloc, which mitigates a lot of the memory bloat issues you can run into with CRuby.
For now it's sufficient for us to document this and move on.
One of our internal users found some potential savings on memory allocations (hence bloat), which we can look to apply later, but which don't block releasing multi-process support.
Currently, labels are handled by being the first argument to all functions. This gives the incorrect impression that all metrics should have labels (in reality most metrics don't have any) and makes label-less use harder.
This client should follow the structure laid out in https://prometheus.io/docs/instrumenting/writing_clientlibs/#labels
In addition, the user should be required to specify all their label names at metric creation time.