
fluent-plugin-prometheus's Introduction

fluent-plugin-prometheus, a plugin for Fluentd


A Fluentd plugin that instruments metrics from records and exposes them via a web interface. Intended to be used together with a Prometheus server.

Requirements

fluent-plugin-prometheus | fluentd     | ruby
-------------------------|-------------|-------
1.x.y                    | >= v1.9.1   | >= 2.4
1.[0-7].y                | >= v0.14.8  | >= 2.1
0.x.y                    | >= v0.12.0  | >= 1.9

Since v1.8.0, fluent-plugin-prometheus uses the http_server helper to launch its HTTP server. If you need to handle many connections, install the async-http gem.

Installation

Add this line to your application's Gemfile:

gem 'fluent-plugin-prometheus'

And then execute:

$ bundle

Or install it yourself as:

$ gem install fluent-plugin-prometheus

Usage

fluent-plugin-prometheus includes six plugins.

  • prometheus input plugin
  • prometheus_monitor input plugin
  • prometheus_output_monitor input plugin
  • prometheus_tail_monitor input plugin
  • prometheus output plugin
  • prometheus filter plugin

See the sample configuration, or try the tutorial.

prometheus input plugin

You have to configure this plugin to expose metrics collected by the other Prometheus plugins. It provides a metrics HTTP endpoint to be scraped by a Prometheus server, listening on 24231/tcp by default.

With the following configuration, you can access http://localhost:24231/metrics on the server where Fluentd is running.

<source>
  @type prometheus
</source>

More configuration parameters:

  • bind: binding interface (default: '0.0.0.0')
  • port: listen port (default: 24231)
  • metrics_path: metrics HTTP endpoint (default: /metrics)
  • aggregated_metrics_path: HTTP endpoint serving metrics aggregated across all workers (default: /aggregated_metrics)
  • content_encoding: encoding format for the exposed metrics (default: identity). Supported formats are {identity, gzip}

When using multiple workers, each worker binds to port + fluent_worker_id. To scrape metrics from all workers at once, you can access http://localhost:24231/aggregated_metrics.
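If you prefer scraping each worker directly instead of the aggregated endpoint, the Prometheus scrape configuration can list one target per worker port. The fragment below is an illustrative sketch (job name and worker count are examples, not taken from this plugin):

```yaml
scrape_configs:
  - job_name: fluentd          # example job name
    static_configs:
      - targets:
          - localhost:24231    # worker 0
          - localhost:24232    # worker 1 (port + fluent_worker_id)
          - localhost:24233    # worker 2
          - localhost:24234    # worker 3
```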

TLS setting

Use <transport tls>. See the transport config article for more details.

<source>
  @type prometheus
  <transport tls>
    # TLS parameters...
  </transport>
</source>

prometheus_monitor input plugin

This plugin collects Fluentd's internal metrics. The metrics are similar to, and partly a subset of, those of monitor_agent.

Exposed metrics

  • fluentd_status_buffer_queue_length
  • fluentd_status_buffer_total_bytes
  • fluentd_status_retry_count
  • fluentd_status_buffer_newest_timekey from fluentd v1.4.2
  • fluentd_status_buffer_oldest_timekey from fluentd v1.4.2

Configuration

With the following configuration, these metrics are collected.

<source>
  @type prometheus_monitor
</source>

More configuration parameters:

  • <labels>: additional labels for this metric (optional). See Labels
  • interval: interval to update monitor_agent information in seconds (default: 5)

prometheus_output_monitor input plugin

This plugin collects internal metrics for output plugins in Fluentd. It is similar to the prometheus_monitor plugin, but specialized for output plugins. It includes many metrics that prometheus_monitor does not, such as num_errors, retry_wait, and so on.

Exposed metrics

Metrics for output

  • fluentd_output_status_retry_count
  • fluentd_output_status_num_errors
  • fluentd_output_status_emit_count
  • fluentd_output_status_retry_wait
    • current retry_wait computed from last retry time and next retry time
  • fluentd_output_status_emit_records
  • fluentd_output_status_write_count
  • fluentd_output_status_rollback_count
  • fluentd_output_status_flush_time_count in milliseconds from fluentd v1.6.0
  • fluentd_output_status_slow_flush_count from fluentd v1.6.0

Metrics for buffer

  • fluentd_output_status_buffer_total_bytes
  • fluentd_output_status_buffer_stage_length from fluentd v1.6.0
  • fluentd_output_status_buffer_stage_byte_size from fluentd v1.6.0
  • fluentd_output_status_buffer_queue_length
  • fluentd_output_status_buffer_queue_byte_size from fluentd v1.6.0
  • fluentd_output_status_buffer_newest_timekey from fluentd v1.6.0
  • fluentd_output_status_buffer_oldest_timekey from fluentd v1.6.0
  • fluentd_output_status_buffer_available_space_ratio from fluentd v1.6.0

Configuration

With the following configuration, these metrics are collected.

<source>
  @type prometheus_output_monitor
</source>

More configuration parameters:

  • <labels>: additional labels for this metric (optional). See Labels
  • interval: interval to update monitor_agent information in seconds (default: 5)
  • gauge_all: specify the metric type. If true, the gauge type is used; if false, the counter type is used. In v2, this parameter will be removed and the counter type will always be used.

prometheus_tail_monitor input plugin

This plugin collects internal metrics for the in_tail plugin in Fluentd. The in_tail plugin holds internal state for each file it is watching, and this state is sometimes important for monitoring that the plugin is working correctly.

This plugin uses an internal class of Fluentd, so it may break between Fluentd versions.

Exposed metrics

  • fluentd_tail_file_position: Current byte position the plugin has read up to in the file
  • fluentd_tail_file_inode: inode of the file
  • fluentd_tail_file_closed: Number of closed files
  • fluentd_tail_file_opened: Number of opened files
  • fluentd_tail_file_rotated: Number of rotated files
  • fluentd_tail_file_throttled: Number of times files got throttled (only with fluentd version > 1.17)

Default labels:

  • plugin_id: a value set for a plugin in configuration.
  • type: plugin name. in_tail only for now.
  • path: file path

Configuration

With the following configuration, these metrics are collected.

<source>
  @type prometheus_tail_monitor
</source>

More configuration parameters:

  • <labels>: additional labels for this metric (optional). See Labels
  • interval: interval to update monitor_agent information in seconds (default: 5)

prometheus output/filter plugin

Both the output and filter plugins instrument metrics from records. Neither plugin modifies record values; they only read them.

Assuming you have the following configuration and receive this message:

<match message>
  @type stdout
</match>
message {
  "foo": 100,
  "bar": 200,
  "baz": 300
}

In filter plugin style:

<filter message>
  @type prometheus
  <metric>
    name message_foo_counter
    type counter
    desc The total number of foo in message.
    key foo
  </metric>
</filter>

<match message>
  @type stdout
</match>

In output plugin style:

<match message>
  @type copy
  <store>
    @type prometheus
    <metric>
      name message_foo_counter
      type counter
      desc The total number of foo in message.
      key foo
    </metric>
  </store>
  <store>
    @type stdout
  </store>
</match>

With the above configuration, the plugin collects a metric named message_foo_counter from the key foo of each record.

You can access nested keys in records via dot or bracket notation (https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-record_accessor#syntax), for example: $.kubernetes.namespace, $['key1'][0]['key2']. The record accessor is enabled only if the value starts with $. or $[.
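As a rough illustration, a dot-notation accessor can be thought of as walking the nested record. The `resolve` helper below is hypothetical (not the plugin's API), and bracket notation and array indices are omitted for brevity:

```ruby
# Hypothetical sketch of dot-notation record access (not the plugin's code).
# Values not starting with "$." are returned unchanged, mirroring the rule
# that the record accessor is enabled only for "$." / "$[" values.
def resolve(record, path)
  return path unless path.start_with?("$.")
  path.delete_prefix("$.").split(".").reduce(record) do |node, key|
    node.is_a?(Hash) ? node[key] : nil  # missing keys resolve to nil
  end
end

record = { "kubernetes" => { "namespace" => "default" } }
resolve(record, "$.kubernetes.namespace")  # => "default"
```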

See Supported Metric Types and Labels for more configuration parameters.

Supported Metric Types

For details of each metric type, see Prometheus documentation. Also see metric name guide.

counter type

<metric>
  name message_foo_counter
  type counter
  desc The total number of foo in message.
  key foo
  <labels>
    tag ${tag}
    host ${hostname}
    foo bar
  </labels>
</metric>
  • name: metric name (required)
  • type: metric type (required)
  • desc: description of this metric (required)
  • key: key name of record for instrumentation (optional)
  • initialized: boolean controlling initialization of the metric (optional). See Metric initialization
  • <labels>: additional labels for this metric (optional). See Labels
  • <initlabels>: labels used to initialize RecordAccessor/placeholder labels (optional). See Metric initialization

If key is empty, the metric value is treated as 1, so the counter increments by 1 on each record regardless of the record's contents.
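The rule above can be sketched with a hypothetical helper (not the plugin's implementation; the behavior for records missing the key is an assumption here):

```ruby
# Sketch of the counter-increment rule: with no key configured, each record
# contributes 1; with a key, the record's value for that key is added.
def counter_increment(record, key = nil)
  # Records missing the key are assumed to contribute 0 in this sketch.
  key.nil? ? 1 : record.fetch(key, 0)
end

counter_increment({ "foo" => 100 })         # => 1
counter_increment({ "foo" => 100 }, "foo")  # => 100
```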

gauge type

<metric>
  name message_foo_gauge
  type gauge
  desc The total number of foo in message.
  key foo
  <labels>
    tag ${tag}
    host ${hostname}
    foo bar
  </labels>
</metric>
  • name: metric name (required)
  • type: metric type (required)
  • desc: description of metric (required)
  • key: key name of record for instrumentation (required)
  • initialized: boolean controlling initialization of the metric (optional). See Metric initialization
  • <labels>: additional labels for this metric (optional). See Labels
  • <initlabels>: labels used to initialize RecordAccessor/placeholder labels (optional). See Metric initialization

summary type

<metric>
  name message_foo
  type summary
  desc The summary of foo in message.
  key foo
  <labels>
    tag ${tag}
    host ${hostname}
    foo bar
  </labels>
</metric>
  • name: metric name (required)
  • type: metric type (required)
  • desc: description of metric (required)
  • key: key name of record for instrumentation (required)
  • initialized: boolean controlling initialization of the metric (optional). See Metric initialization
  • <labels>: additional labels for this metric (optional). See Labels
  • <initlabels>: labels used to initialize RecordAccessor/placeholder labels (optional). See Metric initialization

histogram type

<metric>
  name message_foo
  type histogram
  desc The histogram of foo in message.
  key foo
  buckets 0.1, 1, 5, 10
  <labels>
    tag ${tag}
    host ${hostname}
    foo bar
  </labels>
</metric>
  • name: metric name (required)
  • type: metric type (required)
  • desc: description of metric (required)
  • key: key name of record for instrumentation (required)
  • initialized: boolean controlling initialization of the metric (optional). See Metric initialization
  • buckets: buckets of record for instrumentation (optional)
  • <labels>: additional labels for this metric (optional). See Labels
  • <initlabels>: labels used to initialize RecordAccessor/placeholder labels (optional). See Metric initialization
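For intuition, Prometheus histogram buckets are cumulative: an observation is counted in every bucket whose upper bound is at or above its value, plus the implicit +Inf bucket. The sketch below illustrates that counting rule (it is not the plugin's code):

```ruby
# Count observations into cumulative buckets, as Prometheus histograms do.
def bucket_counts(observations, bounds)
  counts = Hash.new(0)
  observations.each do |v|
    bounds.each { |le| counts[le] += 1 if v <= le }
    counts[:inf] += 1  # the implicit +Inf bucket counts every observation
  end
  counts
end

# With buckets 0.1, 1, 5, 10 as in the example above:
bucket_counts([0.05, 3, 20], [0.1, 1, 5, 10])
# => {0.1=>1, 1=>1, 5=>2, 10=>2, :inf=>3}
```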

Labels

See Prometheus Data Model first.

You can add labels with static values or with dynamic values taken from records. In the prometheus_monitor input plugin, you can't use label values from records.

labels section

<labels>
  key1 value1
  key2 value2
</labels>

All labels sections have the same format. Each line contains a key/value pair for a label.

You can access nested fields in records via dot or bracket notation (https://docs.fluentd.org/plugin-helper-overview/api-plugin-helper-record_accessor#syntax), for example: $.kubernetes.namespace, $['key1'][0]['key2']. The record accessor is enabled only if the value starts with $. or $[. Other values are handled as raw strings and may be expanded by the placeholders described below.

You can use placeholders for label values. Placeholders are expanded from reserved values and records. If you specify ${hostname}, it is expanded to the hostname of the machine where Fluentd runs. The placeholder for records is deprecated; use the record accessor syntax instead.

Reserved placeholders are:

  • ${hostname}: hostname
  • ${worker_id}: fluent worker id
  • ${tag}: tag name
    • only available in Prometheus output/filter plugin
  • ${tag_parts[N]} refers to the Nth part of the tag.
    • only available in Prometheus output/filter plugin
  • ${tag_prefix[N]} refers to the [0..N] part of the tag.
    • only available in Prometheus output/filter plugin
  • ${tag_suffix[N]} refers to the [tagsize-1-N..] part of the tag.
    • where tagsize is the number of parts when the tag is split on . (when the tag is 1.2.3, tagsize is 3)
    • only available in Prometheus output/filter plugin
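The tag placeholder semantics above can be sketched in plain Ruby (illustrative helpers, not the plugin's code), using tag 1.2.3 with tagsize 3:

```ruby
# ${tag_parts[N]}: the Nth part of the tag
def tag_part(tag, n)
  tag.split(".")[n]
end

# ${tag_prefix[N]}: parts 0..N joined back together
def tag_prefix(tag, n)
  tag.split(".")[0..n].join(".")
end

# ${tag_suffix[N]}: parts (tagsize - 1 - N)..end joined back together
def tag_suffix(tag, n)
  parts = tag.split(".")
  parts[(parts.size - 1 - n)..-1].join(".")
end

tag_part("1.2.3", 1)    # => "2"
tag_prefix("1.2.3", 1)  # => "1.2"
tag_suffix("1.2.3", 1)  # => "2.3"
```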

Metric initialization

You can configure whether a metric should be initialized to its zero value before receiving any events. To do so, specify initialized true.

<metric>
  name message_bar_counter
  type counter
  desc The total number of bar in message.
  key bar
  initialized true
  <labels>
    foo bar
  </labels>
</metric>

If your labels contain RecordAccessors or placeholders, you must use <initlabels> to specify the values your RecordAccessors/placeholders will take. This feature is useful only if your placeholders/RecordAccessors take deterministic values. Initialization creates as many zero-value metrics as the number of <initlabels> blocks you define. The reserved placeholders ${hostname} and ${worker_id}, as well as static labels, are added automatically and should not be specified in the <initlabels> configuration.

<metric>
  name message_bar_counter
  type counter
  desc The total number of bar in message.
  key bar
  initialized true
  <labels>
    key $.foo
    tag ${tag}
    foo bar
    worker_id ${worker_id}
  </labels>
  <initlabels>
    key foo1
    tag tag1
  </initlabels>
  <initlabels>
    key foo2
    tag tag2
  </initlabels>
</metric>
<labels>
  hostname ${hostname}
</labels>
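The initialization above can be pictured as follows: each <initlabels> block produces one zero-value series, with static labels and the reserved placeholders merged in automatically. This is an illustrative sketch with a hypothetical helper and assumed merge semantics, not the plugin's code:

```ruby
# Each <initlabels> block becomes one [labels, 0] series; automatic labels
# (static labels, ${hostname}, ${worker_id}) are merged into each one.
def init_series(automatic_labels, initlabels_blocks)
  initlabels_blocks.map { |init| [automatic_labels.merge(init), 0] }
end

series = init_series(
  { "foo" => "bar", "hostname" => "host1", "worker_id" => "0" },
  [{ "key" => "foo1", "tag" => "tag1" },
   { "key" => "foo2", "tag" => "tag2" }]
)
series.length  # => 2, one zero-value metric per <initlabels> block
```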

top-level labels and labels inside metric

A Prometheus output/filter plugin can have multiple metric sections. The top-level labels section specifies labels for all metrics, while a labels section inside a metric section specifies labels for that metric only. If both are specified, the labels are merged.

<filter message>
  @type prometheus
  <metric>
    name message_foo_counter
    type counter
    desc The total number of foo in message.
    key foo
    <labels>
      key foo
      data_type ${type}
    </labels>
  </metric>
  <metric>
    name message_bar_counter
    type counter
    desc The total number of bar in message.
    key bar
    <labels>
      key bar
    </labels>
  </metric>
  <labels>
    tag ${tag}
    hostname ${hostname}
  </labels>
</filter>

In this case, message_foo_counter has the tag, hostname, key, and data_type labels.
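The merge in this example can be sketched as a simple hash merge. The precedence on conflicting keys is an assumption here (per-metric labels winning), and the helper is illustrative, not the plugin's code:

```ruby
# Top-level labels apply to every metric; per-metric labels are merged on top.
def merged_labels(top_level, per_metric)
  top_level.merge(per_metric)  # assumed: per-metric values win on conflict
end

top    = { "tag" => "message", "hostname" => "host1" }
metric = { "key" => "foo", "data_type" => "string" }
merged_labels(top, metric).keys.sort
# => ["data_type", "hostname", "key", "tag"]
```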

Try the plugin with nginx

Check out the repository and set it up.

$ git clone git://github.com/fluent/fluent-plugin-prometheus.git
$ cd fluent-plugin-prometheus
$ bundle install --path vendor/bundle

Download the pre-compiled Prometheus binary and start it. It listens on port 9090.

$ wget https://github.com/prometheus/prometheus/releases/download/v1.5.2/prometheus-1.5.2.linux-amd64.tar.gz -O - | tar zxf -
$ ./prometheus-1.5.2.linux-amd64/prometheus -config.file=./misc/prometheus.yaml -storage.local.path=./prometheus/metrics

Install nginx for sample metrics. It listens on ports 80 and 9999.

$ sudo apt-get install -y nginx
$ sudo cp misc/nginx_proxy.conf /etc/nginx/sites-enabled/proxy
$ sudo chmod 777 /var/log/nginx && sudo chmod +r /var/log/nginx/*.log
$ sudo service nginx restart

Start Fluentd with the sample configuration. It listens on port 24231.

$ bundle exec fluentd -c misc/fluentd_sample.conf -v

Generate some records by accessing nginx.

$ curl http://localhost/
$ curl http://localhost:9999/

Confirm that some metrics are exported via Fluentd.

$ curl http://localhost:24231/metrics

Then, make a graph on Prometheus UI. http://localhost:9090/

Contributing

  1. Fork it ( https://github.com/fluent/fluent-plugin-prometheus/fork )
  2. Create your feature branch (git checkout -b my-new-feature)
  3. Commit your changes (git commit -am 'Add some feature')
  4. Push to the branch (git push origin my-new-feature)
  5. Create a new Pull Request

Copyright

Author: Masahiro Sano
Copyright: Copyright (c) 2015- Masahiro Sano
License: Apache License, Version 2.0

fluent-plugin-prometheus's Issues

`fluentd_*` metrics are not working

When I access the metrics, it only shows the TYPE and HELP lines. All my nginx_metrics are working.

# TYPE fluentd_status_buffer_queue_length gauge
# HELP fluentd_status_buffer_queue_length Current buffer queue length.
# TYPE fluentd_status_buffer_total_bytes gauge
# HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
# TYPE fluentd_status_retry_count gauge
# HELP fluentd_status_retry_count Current retry counts.
# TYPE fluentd_output_status_buffer_queue_length gauge
# HELP fluentd_output_status_buffer_queue_length Current buffer queue length.
# TYPE fluentd_output_status_buffer_total_bytes gauge
# HELP fluentd_output_status_buffer_total_bytes Current total size of queued buffers.
# TYPE fluentd_output_status_retry_count gauge
# HELP fluentd_output_status_retry_count Current retry counts.
# TYPE fluentd_output_status_num_errors gauge
# HELP fluentd_output_status_num_errors Current number of errors.
# TYPE fluentd_output_status_emit_count gauge
# HELP fluentd_output_status_emit_count Current emit counts.
# TYPE fluentd_output_status_emit_records gauge
# HELP fluentd_output_status_emit_records Current emit records.
# TYPE fluentd_output_status_write_count gauge
# HELP fluentd_output_status_write_count Current write counts.
# TYPE fluentd_output_status_rollback_count gauge
# HELP fluentd_output_status_rollback_count Current rollback counts.
# TYPE fluentd_output_status_retry_wait gauge
# HELP fluentd_output_status_retry_wait Current retry wait

All configurations:

Dockerfile:

FROM fluent/fluentd:v0.12-onbuild

USER root

RUN apk add --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && sudo gem install \
        fluent-plugin-rewrite-tag-filter:1.5.6 \
        fluent-plugin-parser:0.6.1 \
        fluent-plugin-prometheus:0.3.0 \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /var/cache/apk/* \
           /home/fluent/.gem/ruby/2.3.0/cache/*.gem

EXPOSE 1514

fluentd.conf

<source>
  @type prometheus
  port 9090
</source>
<source>
  @type monitor_agent
</source>
<source>
  @type prometheus_monitor
</source>
<source>
  @type prometheus_output_monitor
</source>

<source>
  @type syslog
  port 1514
  tag nginx
  message_length_limit 4096
</source>

<filter nginx.**>
  @type parser
  key_name message
  format ltsv
  time_key time
  types request_length:integer,bytes_sent:integer,request_time:float,upstream_connect_time:float,upstream_response_time:float
</filter>

<match nginx.**>
  @type prometheus
  <labels>
     server_host ${server_host}
     method ${request_method}
     status ${status}
     upstream ${upstream}
     cache_status ${upstream_cache_status}
  </labels>

  <metric>
    name nginx_log_body_total_bytes_received
    type counter
    desc nginx body total bytes received
    key request_length
  </metric>

  <metric>
    name nginx_log_body_total_bytes_sent
    type counter
    desc nginx body total bytes sent
    key bytes_sent
  </metric>

  <metric>
    name nginx_log_request_time_seconds
    type histogram
    desc nginx request time
    key request_time
    buckets 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
  </metric>

  <metric>
    name nginx_log_upstream_response_time_seconds
    type histogram
    desc nginx upstream response time
    key upstream_response_time
    buckets 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1
  </metric>

  <metric>
    name nginx_log_upstream_connect_time_seconds
    type histogram
    desc nginx upstream connect time
    key upstream_connect_time
    buckets 0.001, 0.002, 0.005, 0.01, 0.02, 0.03, 0.04, 0.05
  </metric>
</match>

Plugin not able to pass the key

In my PoC, I have this kind of log:

2017-06-30 07:35:41.042584368 +0000 new_sample: {"level":"info","ts":1498808141.0409827,"caller":"sample/main.go:25","msg":"metric","crn":"crn:v1:CRN_CNAME:CRN_CTYPE:containers_kubernetes:CRN_REGION:CRN_INFRA_ID:CRN_SERVICE_NAME:CRN_SERVICE_ID:log","podName":"","sample_error_counter":1126,"counter_label_1":"counter_value_1","counter_label_2":"counter_value_2"}

But the fluent-plugin-prometheus plugin is not able to parse it.
It would be really great if someone could help me here. Thanks.

Measuring fluentd incoming data volumes

We have Fluentd set up as a secure forwarder. We want to find a way to measure the amount of incoming data in bytes or bytes/sec. I only see an input counter that counts the number of records.
Please suggest how we can achieve this.

<source>
  @type forward
  @id forward
  port 24225
  bind 0.0.0.0

  <security>
    self_hostname myhost
    shared_key XXXXXXXXXXXXXXXXXX
  </security>
  <transport tls>
    version TLSv1_2
    ca_path   /fluentd/etc/ssl/certs/ca_private.crt
    cert_path  /fluentd/etc/ssl/certs/XXXX.crt
    private_key_path  /fluentd/etc/ssl/private/.key
    keepalive 3600
    client_cert_auth true
  </transport>
</source>

out-prometheus doesn't refresh gauge if no record is received for a while

Hi, There is my environment:
fluentd: fluentd-0.12.40
fluent-plugin-prometheus: 0.3.0

My fluentd conf looks like:

  <source>
    @type prometheus
  </source>
  <source>
    @type tail
    path /a.log
    pos_file /a.pos
    tag a.log
    format /^(?<count>\d+)$/
    types count:integer
  </source>
  <match a.log>
    @type prometheus
    <metric>
      name message_warn_counter
      type gauge
      desc The total number of count in message in 10s.
      key count
    </metric>
  </match>

When I exec echo 15 >> a.log, I get message_warn_counter 15.0, which is correct. But if I wait for 1 min and exec curl http://localhost:24231/metrics again, I still get message_warn_counter 15.0. I think it should show absent or 0, shouldn't it?

How to scrape metrics in multi-worker mode

We run fluentd with 32 workers as it is fairly loaded. Does this mean I need 32 HTTP endpoints configured on the Prometheus server, starting from port 24231, when using:

<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

Or is there any way to accumulate metrics on one port?

Counter should start from 0

Current implementation is not following the below rule

https://prometheus.io/docs/practices/instrumentation/#avoid-missing-metrics

To avoid this, export 0 (or NaN, if 0 would be misleading) for any time series you know may exist in advance.

This leads to problems, for example when creating a Prometheus alert:

rate(my_metric[1m]) > 0

will not fire an alert when my_metric = 1, but only once my_metric reaches 2.
This is because my_metric does not start at 0 but starts at 1.

source path can use wildcard?

I have many Docker nginx instances in a k8s cluster, so we have many nginx log files on disk.
How can I configure the source path, and does the path support wildcards?
paths: /pods/{nginx-instance-id}/log/access.{current_date}.{current_hour}.log
And can I use this pattern to monitor log files?
path: /pods/*/log/access.*.log

warn on filter parameters

Problem 1:
After taking a close look and reading the plugin docs in this GitHub repo, there is one more problem: the logs are not even visible at http://localhost:24231/metrics.

So I am guessing the source is also not working.

curl http://localhost:24231/metrics does not show anything.

Problem 2:

I am following the steps as per this blog ->
https://blog.treasuredata.com/blog/2016/07/19/routing-data-from-docker-to-prometheus-server-via-fluentd/

The entire filter tag seems to be producing warnings and is not being used. Can you please help?
2017-05-13 19:40:31 -0400 [warn]: parameter 'type' in <filter **>
type prometheus

name test
type counter
desc total count

is not used.
2017-05-13 19:40:31 -0400 [warn]: parameter 'name' in
name test
type counter
desc total count
is not used.
2017-05-13 19:40:31 -0400 [warn]: parameter 'type' in
name test
type counter
desc total count
is not used.
2017-05-13 19:40:31 -0400 [warn]: parameter 'desc' in
name test
type counter
desc total count
is not used.

Tag is empty

td-agent: (3.1.1-0)
fluent-plugin-prometheus: (1.0.1)

My conf:

<system>
  log_level debug
</system>
<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24219
</source>
<source>
  @type    tail
  path     /var/log/syslog
  pos_file /var/spool/td-agent/syslog
  format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (@ (?<facility>[^ .]*)[.](?<priority>[^ ]*) |)(?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
  time_format %b %d %H:%M:%S
  tag      system.syslog
  @label   @mainstream
</source>
<source>
  @type    tail
  path     /var/log/auth.log
  pos_file /var/spool/td-agent/auth.log
  format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (@ (?<facility>[^ .]*)[.](?<priority>[^ ]*) |)(?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
  time_format %b %d %H:%M:%S
  tag      system.auth
  @label   @mainstream
</source>
<source>
  @type    tail
  path     /var/log/mesos-master.log
  pos_file /var/spool/td-agent/mesos-master.log
  format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (@ (?<facility>[^ .]*)[.](?<priority>[^ ]*) |)(?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
  time_format %b %d %H:%M:%S
  tag      mesos.master
  @label   @mainstream
</source>
<source>
  @type    tail
  path     /var/log/mesos-slave.log
  pos_file /var/spool/td-agent/mesos-slave.log
  format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (@ (?<facility>[^ .]*)[.](?<priority>[^ ]*) |)(?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
  time_format %b %d %H:%M:%S
  tag      mesos.slave
  @label   @mainstream
</source>
<source>
  @type    tail
  path     /var/log/marathon.log
  pos_file /var/spool/td-agent/marathon.log
  format /^(?<time>[^ ]*\s*[^ ]* [^ ]*) (?<host>[^ ]*) (@ (?<facility>[^ .]*)[.](?<priority>[^ ]*) |)(?<ident>[a-zA-Z0-9_\/\.\-]*)(?:\[(?<pid>[0-9]+)\])?(?:[^\:]*\:)? *(?<message>.*)$/
  time_format %b %d %H:%M:%S
  tag      marathon.service
  @label   @mainstream
</source>
<match **>
  @type stdout
</match>
<label @mainstream>
  # Log Forwarding
  <match **>
    @type forward
    # primary host
    <server>
      host host1.example.com
      port 24224
    </server>
    # use secondary host
    <server>
      host host2.example.com
      port 24224
      standby
    </server>
    expire_dns_cache 0
    <buffer tag>
      @type file
      flush_at_shutdown false
      flush_interval 15s
      path /var/spool/td-agent/buffer/backlog
    </buffer>
    ignore_network_errors_at_startup true
    time_as_integer true
  </match>
</label>

#plugin for monitoring fluent-agent itself with prometheus
<source>
  @type prometheus
</source>

# input plugin that collects metrics from MonitorAgent
<source>
  @type prometheus_monitor
</source>

# input plugin that collects metrics for output plugin
<source>
  @type prometheus_output_monitor
  <labels>
    hostname ${hostname}
    tag ${tag}
  </labels>
</source>

# input plugin that collects metrics for in_tail plugin
<source>
  @type prometheus_tail_monitor
</source>

My host metrics output:

# TYPE fluentd_status_buffer_queue_length gauge
# HELP fluentd_status_buffer_queue_length Current buffer queue length.
fluentd_status_buffer_queue_length{plugin_id="object:3ffd89d3e4e4",plugin_category="output",type="forward"} 5084.0
# TYPE fluentd_status_buffer_total_bytes gauge
# HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
fluentd_status_buffer_total_bytes{plugin_id="object:3ffd89d3e4e4",plugin_category="output",type="forward"} 444.0
# TYPE fluentd_status_retry_count gauge
# HELP fluentd_status_retry_count Current retry counts.
fluentd_status_retry_count{plugin_id="object:3ffd8ac6e130",plugin_category="output",type="stdout"} 0.0
fluentd_status_retry_count{plugin_id="object:3ffd89d3e4e4",plugin_category="output",type="forward"} 0.0
# TYPE fluentd_output_status_buffer_queue_length gauge
# HELP fluentd_output_status_buffer_queue_length Current buffer queue length.
fluentd_output_status_buffer_queue_length{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 5084.0
# TYPE fluentd_output_status_buffer_total_bytes gauge
# HELP fluentd_output_status_buffer_total_bytes Current total size of queued buffers.
fluentd_output_status_buffer_total_bytes{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 444.0
# TYPE fluentd_output_status_retry_count gauge
# HELP fluentd_output_status_retry_count Current retry counts.
fluentd_output_status_retry_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 0.0
fluentd_output_status_retry_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 0.0
# TYPE fluentd_output_status_num_errors gauge
# HELP fluentd_output_status_num_errors Current number of errors.
fluentd_output_status_num_errors{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 0.0
fluentd_output_status_num_errors{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 0.0
# TYPE fluentd_output_status_emit_count gauge
# HELP fluentd_output_status_emit_count Current emit counts.
fluentd_output_status_emit_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 3.0
fluentd_output_status_emit_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 4.0
# TYPE fluentd_output_status_emit_records gauge
# HELP fluentd_output_status_emit_records Current emit records.
fluentd_output_status_emit_records{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 3.0
fluentd_output_status_emit_records{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 4.0
# TYPE fluentd_output_status_write_count gauge
# HELP fluentd_output_status_write_count Current write counts.
fluentd_output_status_write_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 0.0
fluentd_output_status_write_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 16.0
# TYPE fluentd_output_status_rollback_count gauge
# HELP fluentd_output_status_rollback_count Current rollback counts.
fluentd_output_status_rollback_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 0.0
fluentd_output_status_rollback_count{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 0.0
# TYPE fluentd_output_status_retry_wait gauge
# HELP fluentd_output_status_retry_wait Current retry wait
fluentd_output_status_retry_wait{hostname="host0.example.com",tag="",plugin_id="object:3ffd8ac6e130",type="stdout"} 0.0
fluentd_output_status_retry_wait{hostname="host0.example.com",tag="",plugin_id="object:3ffd89d3e4e4",type="forward"} 0.0
# TYPE fluentd_tail_file_position gauge
# HELP fluentd_tail_file_position Current position of file.
fluentd_tail_file_position{plugin_id="object:3ffd88b36058",type="tail",path="/var/log/syslog"} 1641737.0
fluentd_tail_file_position{plugin_id="object:3ffd88b1f4c0",type="tail",path="/var/log/auth.log"} 7009427.0
fluentd_tail_file_position{plugin_id="object:3ffd88cf05b0",type="tail",path="/var/log/mesos-master.log"} 0.0
fluentd_tail_file_position{plugin_id="object:3ffd88ccecf8",type="tail",path="/var/log/mesos-slave.log"} 0.0
fluentd_tail_file_position{plugin_id="object:3ffd88cb2134",type="tail",path="/var/log/marathon.log"} 0.0
# TYPE fluentd_tail_file_inode gauge
# HELP fluentd_tail_file_inode Current inode of file.
fluentd_tail_file_inode{plugin_id="object:3ffd88b36058",type="tail",path="/var/log/syslog"} 524559.0
fluentd_tail_file_inode{plugin_id="object:3ffd88b1f4c0",type="tail",path="/var/log/auth.log"} 688140.0
fluentd_tail_file_inode{plugin_id="object:3ffd88cf05b0",type="tail",path="/var/log/mesos-master.log"} 0.0
fluentd_tail_file_inode{plugin_id="object:3ffd88ccecf8",type="tail",path="/var/log/mesos-slave.log"} 0.0
fluentd_tail_file_inode{plugin_id="object:3ffd88cb2134",type="tail",path="/var/log/marathon.log"} 0.0

As you can see, tag is empty. What gives?

wrong metrics types for fluentd_output_status_num_errors, et al.?

From reading the code, as well as from my testing, the values of the following keys can go up and down:

fluentd_status_buffer_queue_length
fluentd_status_buffer_total_bytes
fluentd_output_status_buffer_queue_length
fluentd_output_status_buffer_total_bytes
fluentd_output_status_retry_wait

but for the rest, their values do not get reset but are constantly incremented:

fluentd_status_retry_count
fluentd_output_status_emit_count
fluentd_output_status_emit_records
fluentd_output_status_num_errors
fluentd_output_status_retry_count
fluentd_output_status_rollback_count
fluentd_output_status_write_count

Considering the definitions in [1], shouldn't the type of the latter set be "counter" rather than "gauge"?
[1] - https://prometheus.io/docs/concepts/metric_types/
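If the latter set were typed as counters, the usual Prometheus counter idioms would apply directly. In practice, because these series are monotonic even while exposed as gauges, rate-style queries still work as a workaround (PromQL, assuming the metric names above):

```
# per-second error rate over the last 5 minutes, per output plugin
rate(fluentd_output_status_num_errors[5m])

# absolute number of retries in the last hour
increase(fluentd_output_status_retry_count[1h])
```

Note that rate() and increase() assume counter semantics (including reset handling on fluentd restart); they would give misleading results on a gauge that legitimately decreases.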

Note that when you scrape, the result comes with these comments:

# TYPE fluentd_status_buffer_queue_length gauge
# TYPE fluentd_status_buffer_total_bytes gauge
# TYPE fluentd_status_retry_count gauge
# TYPE fluentd_output_status_buffer_queue_length gauge
# TYPE fluentd_output_status_buffer_total_bytes gauge
# TYPE fluentd_output_status_retry_count gauge
# TYPE fluentd_output_status_num_errors gauge
# TYPE fluentd_output_status_emit_count gauge
# TYPE fluentd_output_status_emit_records gauge
# TYPE fluentd_output_status_write_count gauge
# TYPE fluentd_output_status_rollback_count gauge
# TYPE fluentd_output_status_retry_wait gauge

Another note: I'm using fluent-plugin-prometheus (1.3.0).
Thanks.

How to reset values?

I have been trying to figure out how to make this plugin reset its values. The assumption is that after a certain amount of time the values become invalid and should reset to 0. This, however, does not happen, and therefore this plugin, in combination with other plugins such as grepcounter, is pretty much broken, since it can't actually monitor real-world values.

Is this something in the scope of this plugin, or don't you plan to support this at all?

LabelSetValidator::InvalidLabelSetError on passing metrics from fluentd

I have set multiple labels on a matching tag in the fluentd configuration file. These keys might or might not be present in the incoming logs. I am getting the following in td-agent.log:
2018-07-10 12:53:00 +0000 [warn]: unknown placeholder ${u} found
2018-07-10 12:53:00 +0000 [warn]: unknown placeholder ${bid} found
2018-07-10 12:53:00 +0000 [warn]: unknown placeholder ${cnt} found
2018-07-10 12:53:00 +0000 [warn]: unknown placeholder ${curr} found
2018-07-10 12:53:00 +0000 [warn]: prometheus: failed to instrument a metric. error_class=Prometheus::Client::LabelSetValidator::InvalidLabelSetError error=#<Prometheus::Client::LabelSetValidator::InvalidLabelSetError: labels must have the same signature> tag="Tag1" name="fluentd_output_status_num_records_total"
2018-07-10 12:53:00 +0000 [warn]: dump an error event: error_class=Prometheus::Client::LabelSetValidator::InvalidLabelSetError error="labels must have the same signature" tag="Tag1" time=1531227180 record={"name"=>"Tag1", "pid"=>11289, "level"=>50, "c"=>"client", "err"=>"BrokerNotAvailableError: Broker not available", "s"=>"Unsubscribe", "tag"=>"kafka-failure", "msg"=>"Broker not available", "time"=>{}, "v"=>0}

Configuration for the corresponding tag looks like this:

<match Tag1.**>
 @type prometheus
    <metric>
      name fluentd_output_status_num_records_total
      type counter
      desc The total number of outgoing records
      <labels>
        level ${level}
        error ${err}
        user ${u}
        client ${c}
        batchID ${bid}
        count ${cnt}
        sessions ${curr}
        tag ${tag}
        hostname ${hostname}
      </labels>
    </metric>
</match>

Also, I am not able to see the corresponding metrics in prometheus. It looks like this error is blocking it.
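The error usually means the set of label keys differs between observations: when a placeholder like ${u} has no corresponding record key, the resulting label set has a different signature than an earlier one. One possible workaround (a sketch, not from the plugin docs) is to guarantee every referenced key exists before the prometheus block, e.g. with record_transformer defaults:

```
<filter Tag1.**>
  @type record_transformer
  enable_ruby true
  <record>
    # fall back to a fixed string when a key is absent, so the
    # label set signature stays constant across records
    u ${record["u"] || "unknown"}
    bid ${record["bid"] || "unknown"}
    cnt ${record["cnt"] || "unknown"}
    curr ${record["curr"] || "unknown"}
  </record>
</filter>
```

This should also stop the "unknown placeholder" warnings, since every placeholder then always resolves.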

metric value not coming

I want to capture strings from the Prometheus logs and send them as a metric to Prometheus.

The Prometheus logs are as below:

level=info ts=2018-12-13T01:22:34.889490476Z caller=main.go:491 msg="Server is ready to receive web requests."
level=info ts=2018-12-13T01:22:43.606719182Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544623200000 maxt=1544630400000
level=info ts=2018-12-13T01:22:44.867840961Z caller=head.go:348 component=tsdb msg="head GC completed" duration=146.85719ms
level=info ts=2018-12-13T01:22:45.048967385Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=180.821539ms
level=info ts=2018-12-13T01:22:45.096764245Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544630400000 maxt=1544637600000
level=info ts=2018-12-13T01:22:45.159380212Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.044081ms
level=info ts=2018-12-13T01:22:45.164312043Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=4.840133ms
level=info ts=2018-12-13T01:22:45.187817457Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544637600000 maxt=1544644800000
level=info ts=2018-12-13T01:22:45.221853437Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.164832ms
level=info ts=2018-12-13T01:22:45.225348238Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.380607ms
level=info ts=2018-12-13T01:22:45.243914874Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544644800000 maxt=1544652000000
level=info ts=2018-12-13T01:22:45.274733718Z caller=head.go:348 component=tsdb msg="head GC completed" duration=997.842µs
level=info ts=2018-12-13T01:22:45.278033979Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.204118ms
level=info ts=2018-12-13T01:22:45.297322384Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544652000000 maxt=1544659200000
level=info ts=2018-12-13T01:22:45.327136611Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.000777ms
level=info ts=2018-12-13T01:22:45.330860266Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.64782ms
level=info ts=2018-12-13T03:00:00.562411676Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544659200000 maxt=1544666400000
level=info ts=2018-12-13T03:00:01.033471567Z caller=head.go:348 component=tsdb msg="head GC completed" duration=32.736958ms
level=info ts=2018-12-13T03:00:01.037832792Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=4.224238ms

Now I want to capture the string "WAL truncation completed" and count how many times it appears in the Prometheus logs.

Along with your plugin, I use the fluent-plugin-datacounter plugin.

My config file is as below:

# Prevent fluentd from handling records containing its own logs.
    # Do not directly collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
      @type null
    </match>
    # input plugin that exports metrics
    <source>
      @type prometheus
      bind 0.0.0.0
      port 24231
      metrics_path /metrics
    </source>
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    # Clean Up the Logs from others namespace
    <match kubernetes.var.log.containers.**fluentd**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**kube-system**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**default**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**openshift-infra**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prometheus-node-exporter**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_alert**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_fluentd**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prom-proxy**.log>
      @type null
    </match>

    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prometheus-**.log>
      @type datacounter
      tag prom.log.counter
      count_interval 10
      aggregate all
      count_key msg
      pattern1 msg ^2\d\d$
      pattern2 compact compact
    </match>

    <filter prom.log.counter>
      @type prometheus
      <metric>
        name prom_log_counter_compact
        type counter
        desc prom log counter compact
        key compact_count
        <labels>
           host ${hostname}
        </labels>
      </metric>
      <metric>
        name prom_log_counter_wal
        type counter
        desc prom log counter wal
        key msg_count
        <labels>
           host ${hostname}
        </labels>
      </metric>
    </filter>

The metrics come out as below:

[root@masterb PK]# curl http://10.130.0.218:24231/metrics
# TYPE prom_log_counter_compact counter
# HELP prom_log_counter_compact prom log counter compact
prom_log_counter_compact{host="fluentd-ztk6x"} 0.0
# TYPE prom_log_counter_wal counter
# HELP prom_log_counter_wal prom log counter wal
prom_log_counter_wal{host="fluentd-ztk6x"} 0.0

So the metrics data are not coming through properly.

Please help.

Logs from fluentd container ...

2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alerts-proxy-f22d4108d41f820918b2761cbe68976c8b56052e62848246c771f5bf29b3815d.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alert-buffer-a7989a84ed4ab1085c2b70aa0ea53f299aeca537cac23054d912c3c23a811848.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alertmanager-proxy-cf15cd2d92b2267506ba9dbe64c835126b8e628f69b0195daa41fc651f09ac4d.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alertmanager-8fe323033b66b078d04a31540cb8ec673b76d8f5fa62207fca34dcf8ba0eb312.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/fluentd-cd6j2_openshift-metrics_fluentd-935cc54c5993386daa737284211daf17e7efec4fb43b9366dcfc751fdfb17729.log
2018-12-13 06:03:53 +0000 [info]: #0 fluentd worker is now running worker=0
2018-12-13 06:04:03 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:13 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:04:33 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:05:13 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:05:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 2, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:05:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:06:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:06:33 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:06:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:07:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:07:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 5, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, pod_cache_watch_misses: 3
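Two things stand out in the config and logs above. First, the counts stay at 0 because pattern1 `^2\d\d$` only matches a msg that is exactly a three-digit number, which never occurs in these logs. Second, the repeated `no patterns matched tag="prom.log.counter"` warning means the records re-emitted by datacounter have no terminal <match>; a <filter> alone does not consume them. A hedged sketch of a fix (keeping the pattern names so the *_count keys stay the same):

```
# match the actual log text instead of a 3-digit code
pattern1 msg WAL\struncation\scompleted

# give the re-emitted tag a terminal <match> after the prometheus filter
<match prom.log.counter>
  @type null
</match>
```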

Counter metric incrementing with previous values

I am using the prometheus plugin for my counter metrics. I tried an example from the community, but the counter does not increment as expected: each new increment is added on top of the previous values, which is incorrect.

My configuration file:

<source>
  @type prometheus
</source>

<source>
  @type tail
  path /var/log/test.log
  format json
  tag message
  pos_file /mnt/ibm-kube-fluentd-persist/test.pos
  types foo:integer
</source>

<match message>
  @type prometheus
  <metric>
    name message_foo_counter
    type counter
    desc The total number of foo in message.
    key foo
  </metric>
</match>

Contents of log file /var/log/test.log:

{"foo":50,"bar":"10","baz":"20"}
{"foo":100,"bar":"100","baz":"100"}
{"foo":50,"bar":"100","baz":"100"}

Output I am seeing:

curl http://192.168.10.3:24231/metrics
# TYPE message_foo_counter counter
# HELP message_foo_counter The total number of foo in message.
message_foo_counter 400.0

Initially the log file was empty, and I kept adding log lines to increment the counter value.
As you can see, the counter value of 400 comes from re-adding old values: (50) + (50 + 100) + (50 + 100 + 50) = 400.
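For reference, a counter incremented once per record by the record's value should end at 200 for the three log lines above; the observed 400 suggests earlier lines are being re-read and re-counted (e.g. the file being re-tailed from the head). A plain-Ruby sketch of the expected arithmetic (not the plugin's code):

```ruby
# Expected counter semantics: one increment per record, by its 'foo' value.
records = [
  { 'foo' => 50 },
  { 'foo' => 100 },
  { 'foo' => 50 },
]

counter = 0.0
records.each { |r| counter += r['foo'] }
puts counter  # 200.0 -- not the 400.0 observed above
```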

Cannot expose a metric from a nested key

With a receiving message:

<match message>
  @type stdout
</match>

message {
  "foo": {
    "bar": 200
  }
}

Nested key bar cannot be exposed.
The following configuration produces no metric:

<filter message>
  @type prometheus
  <metric>
    name message_bar_summary
    type summary
    desc The summary of bar in foo.
    key foo.bar
  </metric>
</filter>

It works only on root object foo :

<filter message>
  @type prometheus
  <metric>
    name message_foo_summary
    type summary
    desc The summary of foo.
    key foo
  </metric>
</filter>

But a summary on the root object foo has no meaning (the value is always 0).
Is it possible to expose a metric for a nested key (bar)?

Note: record_accessor is not helpful here (the prometheus plugin needs to accept a nested key name, not the value of a nested key).
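Until nested keys are supported, one possible workaround (a sketch, assuming record_transformer with enable_ruby) is to copy the nested value to a top-level key before the prometheus filter:

```
<filter message>
  @type record_transformer
  enable_ruby true
  <record>
    # hoist foo.bar into a top-level key the prometheus filter can read
    foo_bar ${record.dig("foo", "bar")}
  </record>
</filter>

<filter message>
  @type prometheus
  <metric>
    name message_bar_summary
    type summary
    desc The summary of bar in foo.
    key foo_bar
  </metric>
</filter>
```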

versions info:

$ td-agent-gem list |grep prom
fluent-plugin-prometheus (1.2.1)
$ td-agent --version
td-agent 0.14.25
$ /opt/td-agent/embedded/bin/ruby --version
ruby 2.4.2p198 (2017-09-14 revision 59899) [x86_64-linux]

High cpu usage with prometheus_monitor enabled

I have a 2-node fluentd aggregator cluster on AWS ECS. Its CPU usage is around 2% before I enable the prometheus monitor, but whenever I enable it, CPU usage jumps to 40%.

Dockerfile

FROM fluent/fluentd:v1.2.3-onbuild

RUN apk add --update --virtual .build-deps \
        sudo build-base ruby-dev \

 # customize the following instructions as you wish
 && sudo gem install fluent-plugin-s3 \
 && sudo gem install fluent-plugin-flowcounter \
 && sudo gem install fluent-plugin-prometheus \

 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /var/cache/apk/* \
           /home/fluent/.gem/ruby/2.3.0/cache/*.gem

COPY gems/fluent-plugin-dd-0.1.8.gem /home/fluent/

RUN gem install /home/fluent/fluent-plugin-dd-0.1.8.gem

COPY system.conf /fluentd/etc/
COPY sources.conf /fluentd/etc/

Configuration

fluentd.conf

@include system.conf
@include sources.conf

system.conf

<system>
  workers 4
</system>

source.conf

<source>
  @type forward
  port  24224
</source>

<source>
  @type debug_agent
  bind 127.0.0.1
  port 24230
</source>

<source>
  @type monitor_agent
  bind 0.0.0.0
  port 24220
</source>

<match fluent.**>
  @type stdout
</match>

# expose metrics in prometheus format
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>
<source>
  @type prometheus_output_monitor
  interval 10
  <labels>
    hostname ${hostname}
    worker ${worker_id}
  </labels>
</source>

Support for buffer_timekeys metric

Hi,

I've recently had a buffer_timekeys metric added to fluentd, and I'd like to add support for this to the prometheus plugin.

However, this metric takes the form of a list and I'm not sure how best to represent this as a Prometheus metric. I can see a few options:

  1. Add one gauge metric per list item, and delete these every time we refresh the list index. This seems like a hack.
  2. Imitate a histogram by calculating a set of quantiles from the list data. In implementation, this would also be a set of gauge metrics, but recalculated and updated periodically.
  3. As a special case of the above, only provide the newest and oldest timekeys. This is probably sufficient for most monitoring requirements, but ignores most of the source data.

All of these are straightforward to implement, and I'm willing to raise a PR with an implementation. I'm just interested in first finding out what you think the best approach is.
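Option 3 is cheap to compute; a plain-Ruby sketch (a hypothetical helper, not the plugin's API) of reducing the timekey list to the two values that would each back one gauge:

```ruby
# Reduce a list of buffer timekeys (unix seconds) to the oldest and
# newest values, which would each back one gauge metric.
def timekey_bounds(timekeys)
  return [nil, nil] if timekeys.empty?
  [timekeys.min, timekeys.max]
end

oldest, newest = timekey_bounds([1_544_630_400, 1_544_623_200, 1_544_637_600])
puts oldest   # 1544623200
puts newest   # 1544637600
```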

How to install plugin on k8s daemonset?

We are running one of the Kubernetes daemonset containers and would like to have this plugin installed at pod startup. We added a postStart lifecycle rule like this:

lifecycle:
  postStart:
    exec:
      command: ["/bin/sh", "-c", "fluent-gem install fluent-plugin-prometheus"]

And then added this to our configuration:

    <source>
      @type prometheus
    </source>

But fluentd fails to start with this error:

2019-02-07 11:53:07 +0000 [error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ConfigError error="Unknown input plugin 'prometheus'. Run 'gem search -rd fluent-plugin' to find plugins"

How can we make this work without building our own container? It's funny because if we remove the fluentd config but leave the postStart lifecycle rule, then when we log into the pod the plugin is installed. And fluentd can be started manually without errors and it loads the plugin correctly:

root@fluentd-cloudwatch-fzq4r:/home/fluent# fluentd -c /fluentd/etc/prometheus.conf -p /fluentd/plugins --gemfile /fluentd/Gemfile
2019-02-07 11:42:44 +0000 [info]: parsing config file is succeeded path="/fluentd/etc/prometheus.conf"
2019-02-07 11:42:45 +0000 [info]: using configuration file: <ROOT>
  <source>
    @type prometheus
  </source>
</ROOT>
2019-02-07 11:42:45 +0000 [info]: starting fluentd-1.3.3 pid=51 ruby="2.3.3"
2019-02-07 11:42:45 +0000 [info]: spawn command to main:  cmdline=["/usr/bin/ruby2.3", "-Eascii-8bit:ascii-8bit", "/fluentd/vendor/bundle/ruby/2.3.0/bin/fluentd", "-c", "/fluentd/etc/prometheus.conf", "-p", "/fluentd/plugins", "--gemfile", "/fluentd/Gemfile", "--under-supervisor"]
2019-02-07 11:42:48 +0000 [info]: gem 'fluent-plugin-cloudwatch-logs' version '0.7.2'
2019-02-07 11:42:48 +0000 [info]: gem 'fluent-plugin-kubernetes_metadata_filter' version '2.1.6'
2019-02-07 11:42:48 +0000 [info]: gem 'fluent-plugin-prometheus' version '1.3.0'
2019-02-07 11:42:48 +0000 [info]: gem 'fluent-plugin-rewrite-tag-filter' version '2.1.1'
2019-02-07 11:42:48 +0000 [info]: gem 'fluent-plugin-systemd' version '1.0.1'
2019-02-07 11:42:48 +0000 [info]: gem 'fluentd' version '1.3.3'
2019-02-07 11:42:48 +0000 [info]: adding source type="prometheus"
2019-02-07 11:42:49 +0000 [info]: #0 starting fluentd worker pid=55 ppid=51 worker=0
2019-02-07 11:42:49 +0000 [info]: #0 fluentd worker is now running worker=0

What am I missing here?
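A likely explanation: postStart runs concurrently with the container entrypoint, with no guarantee it completes before fluentd parses its config, which matches the symptom (the gem is present once you exec in later). One pattern that avoids a custom image is to run the install in the container command itself before exec'ing fluentd (a sketch; the config and plugin paths are taken from the error output above and may differ in your image):

```yaml
command: ["/bin/sh", "-c"]
args:
  - fluent-gem install fluent-plugin-prometheus &&
    exec fluentd -c /fluentd/etc/fluent.conf -p /fluentd/plugins
```

This delays pod startup by the length of the gem install, so baking the gem into an image is still the more robust option.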

prometheus_output_monitor crashes

I want to use the RecordAccessor helper in a prometheus_output_monitor plugin, but I get the error:

/var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/plugin/filter_record_transformer.rb:230:in `expand': undefined method `gsub' for #<Fluent::PluginHelper::RecordAccessor::Accessor:0x007f25a86283b8> (NoMethodError)
        from /var/lib/gems/2.3.0/gems/fluent-plugin-prometheus-1.3.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:45:in `block in configure'
        from /var/lib/gems/2.3.0/gems/fluent-plugin-prometheus-1.3.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:44:in `each'
        from /var/lib/gems/2.3.0/gems/fluent-plugin-prometheus-1.3.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:44:in `configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/plugin.rb:164:in `configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/root_agent.rb:282:in `add_source'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/root_agent.rb:122:in `block in configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/root_agent.rb:118:in `each'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/root_agent.rb:118:in `configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/engine.rb:131:in `configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/engine.rb:96:in `run_configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/supervisor.rb:795:in `run_configure'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/supervisor.rb:579:in `dry_run'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/supervisor.rb:597:in `supervise'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/supervisor.rb:502:in `run_supervisor'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/lib/fluent/command/fluentd.rb:310:in `<top (required)>'
        from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /usr/lib/ruby/2.3.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /var/lib/gems/2.3.0/gems/fluentd-1.2.5/bin/fluentd:8:in `<top (required)>'
        from /usr/local/bin/fluentd:22:in `load'
        from /usr/local/bin/fluentd:22:in `<main>'

Here is part of my fluentd config:

<source>
  @id prometheus_output_monitor
  @type prometheus_output_monitor
  <labels>
    pod ${hostname}
    node $.kubernetes.host # Error because of this line
  </labels>
</source>

Perhaps you need to make the following change in the file lib/fluent/plugin/in_prometheus_output_monitor.rb:

38     def configure(conf)
39       super
40       hostname = Socket.gethostname
41       expander = Fluent::Plugin::Prometheus.placeholder_expander(log)
42       placeholders = expander.prepare_placeholders({'hostname' => hostname, 'worker_id' => fluentd_worker_id})
43       @base_labels = Fluent::Plugin::Prometheus.parse_labels_elements(conf)
44       @base_labels.each do |key, value|
+++        if value.is_a?(String)
45           @base_labels[key] = expander.expand(value, placeholders)
+++        end
46       end

multiple buffer support

Hi, just curious whether you are going to implement the enhanced API for fluentd v0.14,
which should make it possible for monitor_agent to get data from multiple buffers, per
fluent/fluentd#855.

For example, if you have this in your configuration (with the concat plugin):

<source>
  @type prometheus_monitor
</source>
...
<filter **>
        @type concat
        key message
        multiline_start_regexp /foo.bar/
        flush_interval 5
        stream_identity_key container_id
        timeout_label @OUT
</filter>

this leads to an error:

2017-02-24 12:04:03 +0100 [error]: #0 NoMethodError in monitoring plugins key="buffer_queue_length" plugin=Fluent::Plugin::ConcatFilter error_class=NoMethodError error="undefined method `queue' for {}:Hash"

This is caused by
https://github.com/fluent/fluentd/blob/master/lib/fluent/plugin/in_monitor_agent.rb#L276

where @buffer is a hash due to multiple-buffer usage.

I know this error is not the fault of this plugin; I'm just suggesting a possible improvement.

Information about "fluentd_output_status_emit_count" metric

Hey!

Thanks for this great plugin, we find it very useful!

I'm trying to understand exactly what fluentd_output_status_emit_count measures. I can't find any documentation that describes it, and when I curl the fluentd monitoring agent I don't see it there at all.

Could you please provide some details about this?

Also, since this seems to be a counter, is there any reason it is defined as a gauge?

fluent-plugin-mongo in kubernetes

Hi,

We are running our application in Kubernetes, and we want to ship container logs to MongoDB using Fluentd.
My config file is:
<match **>
@type mongo
log_level info
etc......

During creation of the fluentd pod we get the following error: error="Unknown output plugin 'mongo'. Run 'gem search -rd fluent-plugin' to find plugins".
Please tell me how to install the plugin in Kubernetes.

Regards,
Sathish

Documentation / Behavior with Workers

The documentation regarding multiple workers indicates a listener will be available on all of the workers. Do all of these listeners provide the same stats, or are the stats gathered unique per worker?

prometheus_tail_monitor changes the position file pos

When I use prometheus_tail_monitor, I found that the Fluentd pos_file becomes incorrect. This is because prometheus_tail_monitor calls FilePositionEntry::read_inode and FilePositionEntry::read_pos, which move the file pointer, so when Fluentd writes the pos_file it can end up writing to the wrong location. For related issues, see pos file error #1953.

At the same time, prometheus_tail_monitor also iterates over Fluentd's tails variable. When Fluentd updates tails, this causes "can't add a new key into hash during iteration". For related issues, see can't add a new key into hash during iteration #1804.

It is recommended to temporarily disable prometheus_tail_monitor and add it back when there is a better implementation.

Run as a daemon set in kubernetes cluster

I want to run this as a daemonset in a Kubernetes cluster, so it can capture all container/pod logs and filter some text from the different container logs using "key", then use labels for the container name and namespace so I can easily graph it in Prometheus.

Can anyone help me with that?

Output metric count increments incorrectly

I'm using <match> with <store> to count output record metrics.
Using fluentd v0.14.25; fluent-plugin-elasticsearch (2.11.10); fluent-plugin-prometheus (1.1.0); fluent-plugin-bufferize (0.0.2).

So I have something like this:


<match> 
   <store> @type elasticsearch </store>
   <store> @type prometheus <metric>....</metric></store>
</match>

The problem is that my output metric count increases despite an elasticsearch (i.e., the first store tag) delivery failure.

Provide a way to remove metrics that are not published anymore

It seems the exporter currently has no way to stop displaying metrics that are obsolete (never going to be updated).
For example in kubernetes cluster we can have thousands of containers recreated every hour.
We collect fluentd logging stats (message rate, via the flowcount plugin) per container and use Prometheus to aggregate them. For example, we have a fluentd_flowcounter_count_rate metric and use Prometheus labels to tag the individual container/pod each series belongs to. This works fine; the only problem is that the fluentd prometheus exporter keeps showing metrics that have not been published for a long time and are obsolete (flowcount no longer reports stats for that log file, the log file is removed, the container is deleted). With our rate of container creation and deletion, the exporter output quickly becomes polluted with a large amount of obsolete metrics.

Is there a way to make fluent-plugin-prometheus stop publishing idle metrics?
Maybe it's possible to introduce an extra attribute for the metric to specify an idle timeout, after which the metric is removed from publishing?

System info:
fluentd-0.12.34
'fluent-plugin-prometheus' : '0.3.0'

Fluentd prometheus output configuration:

<match **.log>
  @type copy
  <store>
   type flowcounter
   count_keys *
   unit minute
   aggregate tag
   output_style tagged
   delete_idle true
  </store>
</match>

<filter flowcount>
  @type record_transformer
  enable_ruby true
  remove_keys kubernetes_pod_name,kubernetes_namespace,app,job,instance,pod_template_generation,version

  <record>
    fluentd-tag ${record['tag']}
  </record>
</filter>

<filter flowcount>
  @type prometheus
  <labels>
    tag ${fluentd-tag}
  </labels>
  <metric>
    name fluentd_flowcounter_count_rate
    type gauge
    desc count rate
    key count_rate
  </metric>
</filter>

Output of the exporter:
# TYPE fluentd_flowcounter_count_rate gauge
# HELP fluentd_flowcounter_count_rate count rate
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.4.log"} 0.1
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.7.log"} 0.05
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.5.log"} 0.16
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.1.log"} 0.13
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.6.log"} 0.08
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.9.log"} 0.05
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.2.log"} 0.13
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.8.log"} 0.15
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.3.log"} 0.18
fluentd_flowcounter_count_rate{tag="kubernetes.var.log.containers.test_service.10.log"} 0.11

For example, after we delete kubernetes.var.log.containers.test_service.10.log and kubernetes.var.log.containers.test_service.9.log, they will still be displayed by the exporter even though they will never be updated again.

prometheus_monitor not updating metrics

Currently testing fluentd-0.12.33 with fluent-plugin-prometheus-0.2.1 & prometheus-client-0.6.0 in a Kubernetes cluster. I don't believe metrics are being set in update_monitor_info. All requests to the scrape URL only ever return the TYPE & HELP entries, but never any actual metrics (see below). I have the following minimal fluentd config:

  <source>
    @type prometheus
  </source>
  <source>
    @type prometheus_monitor
  </source>
curl http://xxx.xxx.xxx.xxx:24231/metrics/         
# TYPE fluentd_status_buffer_queue_length gauge
# HELP fluentd_status_buffer_queue_length Current buffer queue length.
# TYPE fluentd_status_buffer_total_bytes gauge
# HELP fluentd_status_buffer_total_bytes Current total size of queued buffers.
# TYPE fluentd_status_retry_count gauge
# HELP fluentd_status_retry_count Current retry counts.

Where do I start debugging this?

Thanks,
Grant

Metrics filter doesn't support the ${tag_parts[X]} config syntax

When trying to use the tag_parts config syntax to add some labels, it doesn't work. The following config should add an "environment" label based on the second part of the tag:

<filter *.* >
  @type prometheus
  <metric>
    name fluentd_input_status_num_records_total
    type counter
    desc The total number of incoming records
    <labels>
      environment ${tag_parts[1]}
      hostname ${hostname}
    </labels>
  </metric>
</filter>
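
For reference, ${tag_parts[N]} is expected to split the event tag on dots and substitute the N-th component (0-indexed), so tag_parts[1] is the second part. A plain-Ruby sketch of that expansion (the function name is illustrative, not the plugin's implementation):

```ruby
# Sketch of what ${tag_parts[N]} expansion is expected to do: split the
# event tag on "." and substitute the N-th component (0-indexed).
def expand_tag_parts(template, tag)
  parts = tag.split(".")
  template.gsub(/\$\{tag_parts\[(\d+)\]\}/) { parts[Regexp.last_match(1).to_i] }
end

# With a tag like "myapp.production.web", tag_parts[1] selects the
# second component:
puts expand_tag_parts("${tag_parts[1]}", "myapp.production.web")
```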

v1.2.0 configure error

I have the simple fluentd config below:

<system>
    @log_level debug
</system>

# Listen to events on Unix domain socket
<source>
  @type unix
  path /var/run/events.sock
  @log_level debug
</source>

# Monitor retries and emit events from various plugins
<source>
  @type prometheus_output_monitor
  interval 10
  <labels>
    hostname ${hostname}
  </labels>
  @id output_monitor
</source>

# Enable prometheus metrics
<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /prometheus/metrics
</source>

# Filter plugin to count incoming events
<filter events.*>
  @type prometheus
  @id prometheus_filter_plugin
  <metric>
    name fluentd_events_input_num_records_total
    type counter
    desc The total number of incoming records
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>

<match events.*>
  @type copy
  <store>
    # count outgoing events
    @id prometheus_output_plugin
    @type prometheus
    <metric>
      name fluentd_events_output_num_records_total
      type counter
      desc The total number of outgoing events
      <labels>
        tag ${tag}
        hostname ${hostname}
      </labels>
    </metric>
  </store>
  <store>
    @type stdout
  </store>
</match>

And I get the error below when starting fluentd:

/usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin/filter_record_transformer.rb:230:in `expand': undefined method `gsub' for #<Fluent::PluginHelper::RecordAccessor::Accessor:0x00007fc76fd95e70> (NoMethodError)
        from /usr/lib/ruby/gems/2.4.0/gems/fluent-plugin-prometheus-1.2.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:45:in `block in configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluent-plugin-prometheus-1.2.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:44:in `each'
        from /usr/lib/ruby/gems/2.4.0/gems/fluent-plugin-prometheus-1.2.0/lib/fluent/plugin/in_prometheus_output_monitor.rb:44:in `configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/plugin.rb:164:in `configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/root_agent.rb:282:in `add_source'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/root_agent.rb:122:in `block in configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/root_agent.rb:118:in `each'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/root_agent.rb:118:in `configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/engine.rb:131:in `configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/engine.rb:96:in `run_configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/supervisor.rb:795:in `run_configure'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/supervisor.rb:579:in `dry_run'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/supervisor.rb:597:in `supervise'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/supervisor.rb:502:in `run_supervisor'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/lib/fluent/command/fluentd.rb:310:in `<top (required)>'
        from /usr/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /usr/lib/ruby/2.4.0/rubygems/core_ext/kernel_require.rb:55:in `require'
        from /usr/lib/ruby/gems/2.4.0/gems/fluentd-1.2.6/bin/fluentd:8:in `<top (required)>'
        from /usr/bin/fluentd:23:in `load'
        from /usr/bin/fluentd:23:in `<main>'

Versions
Fluentd - v0.12
fluent-plugin-prometheus - 1.2.0

Input plugin does not expose port on arbitrary node

I'm starting fluentd with the prometheus plugin as a DaemonSet in GKE. Often, when the container starts, port 24231 is not opened, although on other nodes the ports are exposed normally. When I run ss -lnt, I do not see any open ports in the container.

There are no suspicious messages in the logs even in trace mode.

How I can debug this?

fluentd metrics are not shown on Prometheus

Hello kazegusuri,
I am trying to run fluent-plugin-prometheus on k8s. I see the fluentd pod in the Prometheus targets and it is up, but I can't see any metrics related to docker_command_log on Prometheus. Can you help with this issue?
Please find my files below.

Thanks,

Dockerfile:

FROM fluent/fluentd:v0.12-onbuild

USER root

RUN apk add --update --virtual .build-deps \
        sudo build-base ruby-dev \
 && sudo gem install \
        fluent-plugin-rewrite-tag-filter:1.5.6 \
        fluent-plugin-parser:0.6.1 \
        fluent-plugin-prometheus:0.3.0 \
 && sudo gem sources --clear-all \
 && apk del .build-deps \
 && rm -rf /var/cache/apk/* \
           /home/fluent/.gem/ruby/2.3.0/cache/*.gem

# listen port 24224 for Fluentd forward protocol
EXPOSE 24224
# prometheus plugin expose metrics at 0.0.0.0:24231/metrics
EXPOSE 24231

fluent.conf

<source>
  @type forward
</source>
<source>
  @type prometheus
</source>

<filter docker.**>
  @type prometheus
  <metric>
    name docker_command_log
    type counter
    desc total docker commands
    #key log
  </metric>
</filter>

<match docker.**>
  @type copy
  <store>
    @type stdout
  </store>
</match>

k8s-development.yaml

...
apiVersion: v1
kind: Deployment
metadata:
  name: fluentd
spec:
  template:
    metadata:
      labels:
        app: fluentd
      name: fluentd
    spec:
      containers:
      - image: myimage
        name: fluentd
        ports:
        - containerPort: 24231
          hostPort: 24231

prometheus.yml

...
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'prometheus'

    # Override the global default and scrape targets from this job every 5 seconds.
    scrape_interval: 5s

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    static_configs:
         - targets: ['localhost:9090','fluentd:24231']

fluentd_output_status_buffer_total_bytes metric returns negative (minus) results

We are running fluentd v1.2.2 and fluent-plugin-prometheus v1.0.1 and we've identified few servers that report negative numbers for the fluentd_output_status_buffer_total_bytes metric.

Element Value
fluentd_output_status_buffer_total_bytes{cluster="abc",datacenter="xyz",env="prd",host="kubw-48",instance="x.x.x.x:24231",job="kubernetes-pods",k8s_app="fluentd-logging",kubernetes_namespace="kube-system",kubernetes_pod_name="fluentd-worker-35c9d",kubernetes_pod_node_name="kubw-48",node="kubw-48",plugin_id="object:3fe5d891e918",role="worker",type="forward"} -19394669


I'm not sure whether it's a problem with the plugin or fluentd itself, since I believe the plugin collects the metrics from fluentd's monitoring agent.

Fluentd custom metrics are not working

I've added print-outs before the call at line 162 of the instrument method, `@counter.increment(labels(record, expander, placeholders), value)`:

puts labels(record, expander, placeholders)
puts value

@counter.increment(labels(record, expander, placeholders), value)

here is the output,

$ fluentd -c /etc/fluent/fluent.conf
2019-01-16 11:51:22.000000000 +0300 app.test: {"test":1,"number":2958}
{:tag=>"app.test", :worker=>"0", :hostname=>"fluentd-test"}
1
2019-01-16 11:51:24.000000000 +0300 app.test: {"test":1,"number":2959}
{:tag=>"app.test", :worker=>"0", :hostname=>"fluentd-test"}
1
2019-01-16 11:51:26.000000000 +0300 app.test: {"test":1,"number":2960}
{:tag=>"app.test", :worker=>"0", :hostname=>"fluentd-test"}
1
2019-01-16 11:51:28.000000000 +0300 app.test: {"test":1,"number":2961}
{:tag=>"app.test", :worker=>"0", :hostname=>"fluentd-test"}
1
2019-01-16 11:51:30.000000000 +0300 app.test: {"test":1,"number":2962}
{:tag=>"app.test", :worker=>"0", :hostname=>"fluentd-test"}
1
2019-01-16 11:51:32.000000000 +0300 app.test: {"test":1,"number":2963}

Seems OK, right? But the metrics endpoint tells a different story:

# TYPE fluentd_test_input_status_num_records_total counter
# HELP fluentd_test_input_status_num_records_total The total number of incoming records

configuration

<system>
  workers 8                    # also can be specified by `--workers` command line option
  root_dir /etc/fluent         # strongly recommended to specify this
</system>

<source>
  @type forward
  port 24200
</source>

<source>
  @type monitor_agent
  @id monitor_agent_input

  port 24240
</source>

<source>
  @type prometheus
  bind 0.0.0.0
  port 24231
  metrics_path /metrics
</source>

<filter app.**>
  @type prometheus
  <metric>
    name fluentd_test_input_status_num_records_total
    type counter
    desc The total number of incoming records
    <labels>
      tag ${tag}
      worker ${worker_id}
      hostname ${hostname}
    </labels>
  </metric>
</filter>

<match app.**>
    @type stdout
</match>

gems

$ gem list
fluent-plugin-prometheus (1.3.0)
fluentd (1.3.3)
prometheus-client (0.8.0)
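
A likely cause, given `workers 8`: under multi-process workers each worker keeps its own in-memory metric registry, and the prometheus input in worker N listens on port + fluent_worker_id (24231 + N). Scraping only port 24231 therefore shows just worker 0's registry; counters incremented by the filter in other workers never appear there. A Prometheus scrape sketch covering all eight workers (the host name is illustrative):

```yaml
scrape_configs:
  - job_name: 'fluentd'
    static_configs:
      # one target per fluentd worker: port = 24231 + worker_id
      - targets: ['fluentd:24231', 'fluentd:24232', 'fluentd:24233',
                  'fluentd:24234', 'fluentd:24235', 'fluentd:24236',
                  'fluentd:24237', 'fluentd:24238']
```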

prometheus_output_monitor not updating `num_errors`

I'm trying to understand if my expectation of the value num_errors is wrong, or if I'm doing something wrong in the setup.

My intention was to use num_errors from my elasticsearch_dynamic output plugin to account for rejections when pushing to es. I've modified the elasticsearch_dynamic code so it correctly emits an error event using router.emit_error_event when it gets a 400 response back.
But it seems that the counter metric for num_errors of this plugin is constantly 0.

Am I wrong in assuming that router.emit_error_event would increase it?
Otherwise, assuming router.emit_error_event is being called properly, do you have any suggestions on what kind of misconfiguration could lead to this?

fluentd input metrics are not shown

I am using this article as a reference: https://docs.fluentd.org/v1.0/articles/monitoring-prometheus. With this configuration, fluentd_input_status_num_records_total does not show up in Prometheus. However, fluentd_output_status_buffer_queue_length is shown by the Prometheus output plugin; the filter plugin is not working.

  @type prometheus
  <metric>
    name fluentd_input_status_num_records_total
    type counter
    desc The total number of incoming records
    <labels>
      tag ${tag}
      hostname ${hostname}
    </labels>
  </metric>
</filter>

Broken with fluentd >= 0.14.4

This commit moved MonitorAgentInput into Fluent::Plugin, so this plugin is broken with fluentd >= 0.14.4.

The error message is unexpected error error="uninitialized constant Fluent::MonitorAgentInput"
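
A version-tolerant lookup would resolve the constant from its new location first. The sketch below uses stub modules in place of fluentd and is illustrative, not the plugin's actual fix:

```ruby
# Version-tolerant constant lookup. Stub modules stand in for fluentd
# here; in fluentd >= 0.14.4 the input class lives under Fluent::Plugin.
module Fluent
  module Plugin
    class MonitorAgentInput; end  # new location (>= 0.14.4)
  end
end

def monitor_agent_class
  if defined?(Fluent::Plugin::MonitorAgentInput)
    Fluent::Plugin::MonitorAgentInput
  else
    Fluent::MonitorAgentInput     # old location (< 0.14.4)
  end
end

puts monitor_agent_class  # prints the resolved class name
```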

Fluentd dies when executing example

When executing

bundle exec fluentd -c misc/fluentd_sample.conf -v

it dies:

2016-03-29 15:09:14 -0400 [info]: fluent/supervisor.rb:465:read_config: reading config file path="misc/fluentd_sample.conf"
2016-03-29 15:09:14 -0400 [info]: fluent/supervisor.rb:331:supervise: starting fluentd-0.12.22
2016-03-29 15:09:14 -0400 [info]: fluent/engine.rb:126:block in configure: gem 'fluentd' version '0.12.22'
2016-03-29 15:09:14 -0400 [info]: fluent/engine.rb:126:block in configure: gem 'fluent-plugin-prometheus' version '0.1.1'
2016-03-29 15:09:14 -0400 [info]: fluent/agent.rb:140:add_filter: adding filter pattern="nginx" type="prometheus"
2016-03-29 15:09:14 -0400 [info]: fluent/agent.rb:128:add_match: adding match pattern="nginx" type="copy"
2016-03-29 15:09:14 -0400 [debug]: plugin/out_copy.rb:44:block in configure: adding store type="stdout"
2016-03-29 15:09:14 -0400 [info]: fluent/agent.rb:140:add_filter: adding filter pattern="nginx_proxy" type="prometheus"
2016-03-29 15:09:14 -0400 [info]: fluent/agent.rb:128:add_match: adding match pattern="nginx_proxy" type="copy"
2016-03-29 15:09:14 -0400 [debug]: plugin/out_copy.rb:44:block in configure: adding store type="stdout"
2016-03-29 15:09:14 -0400 [info]: fluent/root_agent.rb:147:add_source: adding source type="prometheus"
2016-03-29 15:09:14 -0400 [info]: fluent/root_agent.rb:147:add_source: adding source type="prometheus_monitor"
2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:372:rescue in main_process: unexpected error error="wrong number of arguments (3 for 1)"
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/plugin/filter_record_transformer.rb:195:in `prepare_placeholders'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/lib/fluent/plugin/in_prometheus_monitor.rb:20:in `configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/root_agent.rb:154:in `add_source'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/root_agent.rb:95:in `block in configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/root_agent.rb:92:in `each'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/root_agent.rb:92:in `configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/engine.rb:129:in `configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/engine.rb:103:in `run_configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:483:in `run_configure'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:154:in `block in start'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:360:in `call'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:360:in `main_process'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:333:in `block in supervise'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:332:in `fork'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:332:in `supervise'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/supervisor.rb:150:in `start'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/lib/fluent/command/fluentd.rb:173:in `<top (required)>'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/bin/fluentd:5:in `require'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/gems/fluentd-0.12.22/bin/fluentd:5:in `<top (required)>'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/bin/fluentd:23:in `load'
  2016-03-29 15:09:14 -0400 [error]: fluent/supervisor.rb:332:fork: /root/fluent-plugin-prometheus/vendor/bundle/ruby/1.9.1/bin/fluentd:23:in `<main>'
2016-03-29 15:09:14 -0400 [info]: fluent/supervisor.rb:348:supervise: process finished code=256
2016-03-29 15:09:14 -0400 [warn]: fluent/supervisor.rb:351:supervise: process died within 1 second. exit.

Is there anything I might be doing wrong?

fluentd dies with an error

In my case fluentd can't start when the prometheus filter is enabled.
Config:

<source>
  type unix
  path td-agent.sock
</source>

<source>
  type prometheus
</source>

<source>
  type tail
  format nginx
  tag nginx
  path /var/log/nginx/access.log
  pos_file fluent_nginx.pos
  types size:integer
</source>

<filter nginx>
  type prometheus

  # You can use counter type with specifying a key,
  # and increments counter by the value
  <metric>
    name nginx_size_counter_bytes
    type counter
    desc nginx bytes sent
    key size
    <labels>
       host ${hostname}
    </labels>
  </metric>

  # You can use counter type without specifying a key
  # This just increments counter by 1
  <metric>
    name nginx_record_counts
    type counter
    desc the number of emitted records
    <labels>
       host ${hostname}
    </labels>
  </metric>
</filter>

<system>
  log_level debug
</system>

The error message from log:

[scripter@dev-main:~]$ fluentd -c fluent.conf
2016-01-14 19:19:36 +0300 [info]: reading config file path="fluent.conf"
2016-01-14 19:19:36 +0300 [info]: starting fluentd-0.12.19
2016-01-14 19:19:36 +0300 [info]: gem 'fluent-plugin-influxdb' version '0.2.2'
2016-01-14 19:19:36 +0300 [info]: gem 'fluent-plugin-prometheus' version '0.1.0'
2016-01-14 19:19:36 +0300 [info]: gem 'fluentd' version '0.12.19'
2016-01-14 19:19:36 +0300 [info]: adding match pattern="message" type="copy"
2016-01-14 19:19:36 +0300 [debug]: adding store type="prometheus"
2016-01-14 19:19:36 +0300 [error]: unexpected error error="undefined method `[]' for #<Fluent::Log:0x00000002362a20>"
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/plugin/filter_record_transformer.rb:174:in `initialize'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluent-plugin-prometheus-0.1.0/lib/fluent/plugin/prometheus.rb:49:in `new'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluent-plugin-prometheus-0.1.0/lib/fluent/plugin/prometheus.rb:49:in `placeholder_expnader'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluent-plugin-prometheus-0.1.0/lib/fluent/plugin/prometheus.rb:65:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluent-plugin-prometheus-0.1.0/lib/fluent/plugin/out_prometheus.rb:14:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/plugin/out_copy.rb:44:in `block in configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/plugin/out_copy.rb:33:in `each'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/plugin/out_copy.rb:33:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/agent.rb:129:in `add_match'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/agent.rb:60:in `block in configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/agent.rb:54:in `each'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/agent.rb:54:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/root_agent.rb:82:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/engine.rb:117:in `configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/engine.rb:91:in `run_configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:515:in `run_configure'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:146:in `block in start'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:352:in `call'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:352:in `main_process'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:325:in `block in supervise'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:324:in `fork'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:324:in `supervise'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/supervisor.rb:142:in `start'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/lib/fluent/command/fluentd.rb:171:in `<top (required)>'
  2016-01-14 19:19:36 +0300 [error]: /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
  2016-01-14 19:19:36 +0300 [error]: /usr/lib/ruby/1.9.1/rubygems/custom_require.rb:36:in `require'
  2016-01-14 19:19:36 +0300 [error]: /var/lib/gems/1.9.1/gems/fluentd-0.12.19/bin/fluentd:6:in `<top (required)>'
  2016-01-14 19:19:36 +0300 [error]: /usr/local/bin/fluentd:23:in `load'
  2016-01-14 19:19:36 +0300 [error]: /usr/local/bin/fluentd:23:in `<main>'
2016-01-14 19:19:36 +0300 [info]: process finished code=256
2016-01-14 19:19:36 +0300 [warn]: process died within 1 second. exit.

It seems there is an error on lines 49 and 56 of prometheus.rb. The PlaceholderExpander class takes two parameters according to its declaration in filter_record_transformer.rb, but in prometheus.rb the PlaceholderExpander constructor is called with only one parameter (log).

I tried replacing new(log) with new({ :log => log }) and everything works well after that. I'm not a Ruby professional or a fluentd expert and I don't fully understand what I did, so I kindly ask you to check it out :)
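
The mismatch can be reproduced in isolation: an initializer that expects an options hash fails with a NoMethodError when handed a bare logger object. Class and names below are illustrative:

```ruby
# Minimal reproduction of the mismatch: an initializer that expects an
# options hash raises NoMethodError when given a bare object instead.
class Expander
  def initialize(params)
    @log = params[:log]  # fails unless params responds to []
  end
end

logger = Object.new

begin
  Expander.new(logger)             # mimics the buggy new(log)
rescue NoMethodError => e
  puts e.class                     # NoMethodError
end

Expander.new({ :log => logger })   # mimics the fix: new({ :log => log })
puts "ok"
```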

Support multi process workers

Fluentd v0.14.12 introduced support for multi-process workers.
This plugin isn't ready for multi-process workers, so the current workaround is to restrict it to a single worker in the configuration, but this results in losing metrics from the other workers.

Is there any plan to bring multi-process worker support to fluent-plugin-prometheus?
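
For reference, later versions of the plugin do handle this: with multiple workers, each worker's prometheus input binds to port + fluent_worker_id, and newer versions also expose an aggregated endpoint that returns metrics from all workers at once. A sketch of the relevant configuration (check the README for the exact versions that support it):

```
<system>
  workers 4
</system>

<source>
  @type prometheus
  port 24231   # worker N actually listens on 24231 + N
  # all workers' metrics in one scrape:
  # http://localhost:24231/aggregated_metrics
  aggregated_metrics_path /aggregated_metrics
</source>
```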

unexpected error error="wrong number of arguments (1 for 3)"

When launching fluentd with the exporter plugin installed I get the following error:

2016-06-01 22:59:45 +0000 [error]: unexpected error error="wrong number of arguments (1 for 3)"
  2016-06-01 22:59:45 +0000 [error]: /usr/lib/ruby/gems/2.2.0/gems/fluentd-0.14.0.pre.1/lib/fluent/plugin/filter_record_transformer.rb:172:in `prepare_placeholders'
  2016-06-01 22:59:45 +0000 [error]: /usr/lib/ruby/gems/2.2.0/gems/fluent-plugin-prometheus-0.1.2/lib/fluent/plugin/in_prometheus_monitor.rb:20:in `configure'

Looks like this is due to a typo in https://github.com/kazegusuri/fluent-plugin-prometheus/blob/master/lib/fluent/plugin/in_prometheus_monitor.rb#L19

placeholder_expnader should read placeholder_expander

fluentd_build_info

For many exporters, a bit of metadata is added to the metrics. For example, from Prometheus itself:

# HELP prometheus_build_info A metric with a constant '1' value labeled by version, revision, branch, and goversion from which prometheus was built.
# TYPE prometheus_build_info gauge
prometheus_build_info{branch="HEAD",goversion="go1.9.1",revision="3a7c51ab70fc7615cd318204d3aa7c078b7c5b20",version="1.8.1"} 1

or Grafana

# HELP grafana_info Information about the Grafana
# TYPE grafana_info gauge
grafana_info{version="4.6.1"} 1

Would you accept a pull request to add similar metadata to this plugin?
I was thinking something like

fluentd_build_info{rubyversion="??", tdagentversion="??"} 1

It would be useful to help see what was deployed.
