
fluent-plugin-datacounter's Introduction

fluent-plugin-datacounter

This is a plugin for Fluentd

Requirements

fluent-plugin-datacounter | fluentd    | ruby
>= 1.0.0                  | >= v0.14.8 | >= 2.1
< 1.0.0                   | < v0.14.0  | >= 1.9

Component

DataCounterOutput

Counts messages whose specified attribute matches any of the specified regexp patterns.

  • Counts per min/hour/day
  • Counts per second (averaged over each min/hour/day)
  • Percentage of each pattern in the total message count

DataCounterOutput emits messages containing the results, so you can route these messages (tagged 'datacount' by default) to any output you want.

output ex1 (aggregates all inputs): {"pattern1_count":20, "pattern1_rate":0.333, "pattern1_percentage":25.0, "pattern2_count":40, "pattern2_rate":0.666, "pattern2_percentage":50.0, "unmatched_count":20, "unmatched_rate":0.333, "unmatched_percentage":25.0}
output ex2 (aggregates per tag): {"test_pattern1_count":10, "test_pattern1_rate":0.333, "test_pattern1_percentage":25.0, "test_pattern2_count":40, "test_pattern2_rate":0.666, "test_pattern2_percentage":50.0, "test_unmatched_count":20, "test_unmatched_rate":0.333, "test_unmatched_percentage":25.0}
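As a minimal sketch (plain Python, not the plugin's actual code), the emitted fields can be derived from the raw per-pattern counts and the interval length like this:

```python
# Hypothetical illustration of how count/rate/percentage fields relate:
# count = raw matches in the interval, rate = matches per second,
# percentage = share of all counted messages.
interval_seconds = 60  # 'unit minute'
counts = {"pattern1": 20, "pattern2": 40, "unmatched": 20}

total = sum(counts.values())
result = {}
for name, count in counts.items():
    result[f"{name}_count"] = count
    # rate: average events per second over the interval
    result[f"{name}_rate"] = round(count / interval_seconds, 3)
    # percentage: this pattern's share of all counted messages
    result[f"{name}_percentage"] = count * 100.0 / total

print(result["pattern1_rate"])        # 0.333
print(result["pattern2_percentage"])  # 50.0
```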

The 'input_tag_remove_prefix' option is available if you want to remove a tag prefix from the output field names.

Configuration

DataCounterOutput

Count messages whose attribute 'referer' is 'google.com', 'yahoo.com', or 'facebook.com', across all matched messages, per minute.

<match accesslog.**>
  @type datacounter
  unit minute
  aggregate all
  count_key referer
  # patternX: X(1-20)
  pattern1 google google.com
  pattern2 yahoo  yahoo.com
  pattern3 facebook facebook.com
  # note: the patterns above also match 'this-is-facebookXcom.xxxsite.com' ...
</match>

Or, with more exact match patterns, output per tag (the default, 'aggregate tag'), per hour.

<match accesslog.**>
  @type datacounter
  unit hour
  count_key referer
  # patternX: X(1-20)
  pattern1 google ^http://www\.google\.com/.*
  pattern2 yahoo  ^http://www\.yahoo\.com/.*
  pattern3 twitter ^https://twitter\.com/.*
</match>

HTTP status code patterns.

<match accesslog.**>
  @type datacounter
  count_interval 1m    # just same as 'unit minute' and 'count_interval 60s'
                       # you can also specify '30s', '5m', '2h' ....
  count_key status
  # patternX: X(1-20)
  pattern1 2xx ^2\d\d$
  pattern2 3xx ^3\d\d$
  pattern3 404 ^404$    # we want only 404 counts...
  pattern4 4xx ^4\d\d$  # pattern4 doesn't match messages already matched by pattern[123]
  pattern5 5xx ^5\d\d$
</match>
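The pattern4 comment above implies first-match-wins ordering: patterns are tested in numeric order and each message is counted under the first one that matches. A standalone sketch of that behavior (assumption based on the comment, not the plugin's source):

```python
import re

# Patterns tested in order; the first match determines the counter,
# so the specific '404' pattern must come before the broad '4xx' one.
patterns = [
    ("2xx", re.compile(r"^2\d\d$")),
    ("3xx", re.compile(r"^3\d\d$")),
    ("404", re.compile(r"^404$")),
    ("4xx", re.compile(r"^4\d\d$")),
    ("5xx", re.compile(r"^5\d\d$")),
]

def classify(value):
    for name, regexp in patterns:
        if regexp.search(value):
            return name
    return "unmatched"

print(classify("404"))  # '404', not '4xx': the earlier pattern wins
print(classify("403"))  # '4xx'
```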

If you do not want to include 'unmatched' counts in the percentage, use the 'outcast_unmatched' option:

<match accesslog.**>
  @type datacounter
  count_key status
  # patternX: X(1-20)
  pattern1 2xx ^2\d\d$
  pattern2 3xx ^3\d\d$
  pattern3 4xx ^4\d\d$
  pattern4 5xx ^5\d\d$
  outcast_unmatched yes # '*_percentage' fields are calculated without 'unmatched' counts, and
                        # the 'unmatched_percentage' field will not be produced
</match>
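The effect of the option on the percentage denominator can be sketched as follows (hypothetical helper, not the plugin's internals):

```python
# With outcast_unmatched, unmatched messages are excluded from the
# denominator and no unmatched_percentage field is emitted.
def percentages(counts, unmatched, outcast_unmatched):
    denom = sum(counts.values()) + (0 if outcast_unmatched else unmatched)
    result = {f"{k}_percentage": v * 100.0 / denom for k, v in counts.items()}
    if not outcast_unmatched:
        result["unmatched_percentage"] = unmatched * 100.0 / denom
    return result

print(percentages({"2xx": 60, "5xx": 20}, 20, outcast_unmatched=False))
# {'2xx_percentage': 60.0, '5xx_percentage': 20.0, 'unmatched_percentage': 20.0}
print(percentages({"2xx": 60, "5xx": 20}, 20, outcast_unmatched=True))
# {'2xx_percentage': 75.0, '5xx_percentage': 25.0}
```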

With the 'output_per_tag' and 'tag_prefix' options, we get one result message per input tag, like below:

<match accesslog.{foo,bar}>
  @type datacounter
  count_key status
  pattern1 OK ^2\d\d$
  pattern2 NG ^\d\d\d$
  input_tag_remove_prefix accesslog
  output_per_tag yes
  tag_prefix status
</match>
# => tag: 'status.foo' or 'status.bar'
#    message: {'OK_count' => 60, 'OK_rate' => 1.0, 'OK_percentage' => 70, 'NG_count' => , ....}

And you can get the count of tested messages with the 'output_messages' option:

<match accesslog.{foo,bar}>
  @type datacounter
  count_key status
  pattern1 OK ^2\d\d$
  pattern2 NG ^\d\d\d$
  input_tag_remove_prefix accesslog
  output_messages yes
</match>
# => tag: 'datacount'
#    message: {'foo_messages' => xxx, 'foo_OK_count' => ... }

<match accesslog.baz>
  @type datacounter
  count_key status
  pattern1 OK ^2\d\d$
  pattern2 NG ^\d\d\d$
  input_tag_remove_prefix accesslog
  output_per_tag yes
  tag_prefix datacount
  output_messages yes
</match>
# => tag: 'datacount.baz'
#    message: {'messages' => xxx, 'OK_count' => ...}
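The tag rewriting in the examples above (strip 'input_tag_remove_prefix', then prepend 'tag_prefix' when output_per_tag is enabled) can be sketched like this; the helper name is mine, not from the plugin:

```python
# Hypothetical sketch: 'accesslog.baz' with remove_prefix 'accesslog'
# and tag_prefix 'datacount' becomes 'datacount.baz'.
def output_tag(input_tag, remove_prefix, tag_prefix):
    stripped = input_tag
    if remove_prefix and input_tag.startswith(remove_prefix + "."):
        stripped = input_tag[len(remove_prefix) + 1:]
    return f"{tag_prefix}.{stripped}"

print(output_tag("accesslog.baz", "accesslog", "datacount"))  # datacount.baz
print(output_tag("accesslog.foo", "accesslog", "status"))     # status.foo
```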

Parameters

  • count_key (required)

    The key to count in the event record.

  • tag

    The output tag. Default is datacount.

  • tag_prefix

    The prefix string which will be added to the input tag. output_per_tag yes must be specified together.

  • input_tag_remove_prefix

    The prefix string which will be removed from the input tag.

  • count_interval

    The interval time to count in seconds. Default is 60.

  • unit

    The interval to monitor, specified as a unit (one of minute, hour, or day). Use either count_interval or unit.

  • aggregate

    Calculate for each input tag separately, or over all records together. Default is tag.

  • output_per_tag

    Emit for each input tag. tag_prefix must be specified together. Default is no.

  • outcast_unmatched

    Specify yes if you do not want to include 'unmatched' counts into percentage. Default is no.

  • output_messages

    Specify yes if you want to get tested messages. Default is no.

  • store_file

    Store internal data into a file of the given path on shutdown, and load on starting.
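The count_interval and unit parameters above express the same setting in two forms ('unit minute' equals 'count_interval 60s'). A sketch of how such interval specs could map to seconds (the mapping shown is an assumption mirroring the documented forms, not the plugin's actual parser):

```python
import re

# 'minute'/'hour'/'day' come from the unit parameter;
# '30s'/'5m'/'2h' come from count_interval.
UNIT_SECONDS = {"minute": 60, "hour": 3600, "day": 86400}
SUFFIX_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}

def interval_seconds(spec):
    if spec in UNIT_SECONDS:                  # 'unit minute' style
        return UNIT_SECONDS[spec]
    m = re.fullmatch(r"(\d+)([smhd])", spec)  # '30s', '5m', '2h' style
    if m:
        return int(m.group(1)) * SUFFIX_SECONDS[m.group(2)]
    raise ValueError(f"unsupported interval: {spec}")

print(interval_seconds("minute"))  # 60
print(interval_seconds("5m"))      # 300
```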

TODO

  • consider what to do next
  • patches welcome!

Copyright

Copyright:: Copyright (c) 2012- TAGOMORI Satoshi (tagomoris)
License:: Apache License, Version 2.0

fluent-plugin-datacounter's People

Contributors

cosmo0920, hfm, kiyoto, okkez, shyouhei, sonots, tagomoris


fluent-plugin-datacounter's Issues

Pattern not matching

I want to capture strings from Prometheus logs and send them as a metric to Prometheus.

The Prometheus logs are as below:

level=info ts=2018-12-13T01:22:34.889490476Z caller=main.go:491 msg="Server is ready to receive web requests."
level=info ts=2018-12-13T01:22:43.606719182Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544623200000 maxt=1544630400000
level=info ts=2018-12-13T01:22:44.867840961Z caller=head.go:348 component=tsdb msg="head GC completed" duration=146.85719ms
level=info ts=2018-12-13T01:22:45.048967385Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=180.821539ms
level=info ts=2018-12-13T01:22:45.096764245Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544630400000 maxt=1544637600000
level=info ts=2018-12-13T01:22:45.159380212Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.044081ms
level=info ts=2018-12-13T01:22:45.164312043Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=4.840133ms
level=info ts=2018-12-13T01:22:45.187817457Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544637600000 maxt=1544644800000
level=info ts=2018-12-13T01:22:45.221853437Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.164832ms
level=info ts=2018-12-13T01:22:45.225348238Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.380607ms
level=info ts=2018-12-13T01:22:45.243914874Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544644800000 maxt=1544652000000
level=info ts=2018-12-13T01:22:45.274733718Z caller=head.go:348 component=tsdb msg="head GC completed" duration=997.842µs
level=info ts=2018-12-13T01:22:45.278033979Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.204118ms
level=info ts=2018-12-13T01:22:45.297322384Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544652000000 maxt=1544659200000
level=info ts=2018-12-13T01:22:45.327136611Z caller=head.go:348 component=tsdb msg="head GC completed" duration=1.000777ms
level=info ts=2018-12-13T01:22:45.330860266Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=3.64782ms
level=info ts=2018-12-13T03:00:00.562411676Z caller=compact.go:393 component=tsdb msg="compact blocks" count=1 mint=1544659200000 maxt=1544666400000
level=info ts=2018-12-13T03:00:01.033471567Z caller=head.go:348 component=tsdb msg="head GC completed" duration=32.736958ms
level=info ts=2018-12-13T03:00:01.037832792Z caller=head.go:357 component=tsdb msg="WAL truncation completed" duration=4.224238ms

Now I want to capture the string 'WAL truncation completed' and count how many times it appears in the Prometheus logs.
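Outside fluentd, the counting the reporter is after looks like this (standalone sketch with made-up sample lines; a substring regexp, unlike the anchored '^2\d\d$' below, matches inside a longer msg value):

```python
import re

# Hypothetical sample lines in the Prometheus log format quoted above.
lines = [
    'level=info msg="head GC completed" duration=1.044081ms',
    'level=info msg="WAL truncation completed" duration=4.840133ms',
    'level=info msg="WAL truncation completed" duration=3.380607ms',
]

# Unanchored pattern: matches the phrase anywhere in the line.
pattern = re.compile(r"WAL truncation completed")
wal_count = sum(1 for line in lines if pattern.search(line))
print(wal_count)  # 2
```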

Along with your plugin, I use the fluent-plugin-datacounter plugin.

My config file is as below:

# Prevent fluentd from handling records containing its own logs.
    # Do not directly collect fluentd's own logs to avoid infinite loops.
    <match fluent.**>
      @type null
    </match>
    # input plugin that exports metrics
    <source>
      @type prometheus
      bind 0.0.0.0
      port 24231
      metrics_path /metrics
    </source>
    <source>
      @type tail
      path /var/log/containers/*.log
      pos_file /var/log/fluentd-containers.log.pos
      time_format %Y-%m-%dT%H:%M:%S.%NZ
      tag kubernetes.*
      format json
      read_from_head true
    </source>
    <filter kubernetes.**>
      @type kubernetes_metadata
    </filter>
    # Clean Up the Logs from others namespace
    <match kubernetes.var.log.containers.**fluentd**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**kube-system**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**default**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.**openshift-infra**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prometheus-node-exporter**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_alert**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_fluentd**.log>
      @type null
    </match>
    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prom-proxy**.log>
      @type null
    </match>

    <match kubernetes.var.log.containers.prometheus-0_openshift-metrics_prometheus-**.log>
      @type datacounter
      tag prom.log.counter
      count_interval 10
      aggregate all
      count_key msg
      pattern1 msg ^2\d\d$
      pattern2 compact compact
    </match>

    <filter prom.log.counter>
      @type prometheus
      <metric>
        name prom_log_counter_compact
        type counter
        desc prom log counter compact
        key compact_count
        <labels>
           host ${hostname}
        </labels>
      </metric>
      <metric>
        name prom_log_counter_wal
        type counter
        desc prom log counter wal
        key msg_count
        <labels>
           host ${hostname}
        </labels>
      </metric>
    </filter>

The metrics come out as below:

[root@masterb PK]# curl http://10.130.0.218:24231/metrics
# TYPE prom_log_counter_compact counter
# HELP prom_log_counter_compact prom log counter compact
prom_log_counter_compact{host="fluentd-ztk6x"} 0.0
# TYPE prom_log_counter_wal counter
# HELP prom_log_counter_wal prom log counter wal
prom_log_counter_wal{host="fluentd-ztk6x"} 0.0

So the metrics data are not coming through properly.

Please help ....

Logs from the fluentd container:

2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alerts-proxy-f22d4108d41f820918b2761cbe68976c8b56052e62848246c771f5bf29b3815d.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alert-buffer-a7989a84ed4ab1085c2b70aa0ea53f299aeca537cac23054d912c3c23a811848.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alertmanager-proxy-cf15cd2d92b2267506ba9dbe64c835126b8e628f69b0195daa41fc651f09ac4d.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/prometheus-0_openshift-metrics_alertmanager-8fe323033b66b078d04a31540cb8ec673b76d8f5fa62207fca34dcf8ba0eb312.log
2018-12-13 06:03:53 +0000 [info]: #0 following tail of /var/log/containers/fluentd-cd6j2_openshift-metrics_fluentd-935cc54c5993386daa737284211daf17e7efec4fb43b9366dcfc751fdfb17729.log
2018-12-13 06:03:53 +0000 [info]: #0 fluentd worker is now running worker=0
2018-12-13 06:04:03 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:13 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:04:33 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:04:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:05:13 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:05:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 2, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:05:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:06:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:06:33 +0000 [warn]: #0 no patterns matched tag="prom.log.counter"
2018-12-13 06:06:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:07:24 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 4, pod_cache_watch_misses: 3, namespace_cache_api_updates: 4, pod_cache_api_updates: 4, id_cache_miss: 4
2018-12-13 06:07:54 +0000 [info]: #0 stats - namespace_cache_size: 3, pod_cache_size: 5, namespace_cache_api_updates: 5, pod_cache_api_updates: 5, id_cache_miss: 5, pod_cache_watch_misses: 3

Unit test fails after upgrading fluentd to v0.14.6 from 0.12.29

With 0.12.29

% bundle exec rake test
<snip>
Finished in 7.916504144668579 seconds.
--------------------------------------------------------------------------------
13 tests, 248 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
100% passed
--------------------------------------------------------------------------------
1.64 tests/s, 31.33 assertions/s

With 0.14.6

$ bundle exec rake test
/Users/hhatake/.rbenv/versions/2.3.0/bin/ruby -w -I"lib:lib:test" -I"/Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/rake-11.2.2/lib" "/Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/rake-11.2.2/lib/rake/rake_test_loader.rb" "test/**/test_*.rb" 
<snip>
Loaded suite /Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/rake-11.2.2/lib/rake/rake_test_loader
Started
.........F
================================================================================
Failure: test_store_file(DataCounterOutputTest)
/Users/hhatake/Github/fluent-plugin-datacounter/test/plugin/test_out_datacounter.rb:706:in `block in test_store_file'
     703:       loaded_counts = d.instance.counts
     704:       loaded_saved_at = d.instance.saved_at
     705:       loaded_saved_duration = d.instance.saved_duration
  => 706:       assert_equal({}, loaded_counts)
     707:       assert_equal(nil, loaded_saved_at)
     708:       assert_equal(nil, loaded_saved_duration)
     709:     end
/Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/fluentd-0.14.6/lib/fluent/test/input_test.rb:123:in `block in run'
/Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/fluentd-0.14.6/lib/fluent/test/base.rb:71:in `run'
/Users/hhatake/Github/fluent-plugin-datacounter/vendor/bundle/ruby/2.3.0/gems/fluentd-0.14.6/lib/fluent/test/input_test.rb:122:in `run'
/Users/hhatake/Github/fluent-plugin-datacounter/test/plugin/test_out_datacounter.rb:702:in `test_store_file'
<{}> expected but was
<{"test.input"=>[3, 0, 0, 0, 0, 3]}>

diff:
? {"test.input"=>[3, 0, 0, 0, 0, 3]}
================================================================================
...

Finished in 68.84033346176147 seconds.
--------------------------------------------------------------------------------
13 tests, 246 assertions, 1 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications
92.3077% passed
--------------------------------------------------------------------------------
0.19 tests/s, 3.57 assertions/s
rake aborted!

I have no idea what causes this failure, though.... :/

example config syntax error

<match apache.**>
  type datacounter
  unit min
  count_key status
  # patternX: X(1-9)
  pattern1 2xx ^2\d\d$
  pattern2 3xx ^3\d\d$
  pattern3 404 ^404$    # we want only 404 counts...
  pattern4 4xx ^4\d\d$  # pattern4 doesn't match messages already matched by pattern[123]
  pattern5 5xx ^5\d\d$
</match>

When I try to start with the sample from http://rubydoc.info/gems/fluent-plugin-datacounter/0.1.0/frames, 'unit min' fails with a syntax error.

/etc/init.d/td-agent restart
Shutting down td-agent: [FAILED]

Starting td-agent: 2012-02-23 13:53:53 +0900: fluent/supervisor.rb:177:rescue in main_process: config error file="/etc/td-agent/td-agent.conf" error="flowcounter unit allows minute/hour/day"

After changing min → minute:

/etc/init.d/td-agent restart
Shutting down td-agent: [FAILED]
Starting td-agent: [ OK ]

Add config options to filter records by a specified key before counting

Could you consider adding support for applying the patternX counting only when a specified key matches a given regexp?

For example, in a multi-domain environment where you also want per-domain response code counts, this feature would be very versatile and useful.

A drawback is that a match section has to be added for every new domain, but for graphing (e.g. with GrowthForecast), separate graphs are both easier to read and simpler to implement.

[Proposed spec]
prefilter_key
prefilter_pattern

[Example config]
Aggregate HTTP responses only when the key 'domain' matches ^headlines\.yahoo\.co\.jp$:

<match accesslog.**>
  type datacounter
  count_interval 1m    # just same as 'unit minute' and 'count_interval 60s'
                       # you can also specify '30s', '5m', '2h' ....
  prefilter_key domain
  prefilter_pattern ^headlines\.yahoo\.co\.jp$
  count_key status
  # patternX: X(1-20)
  pattern1 2xx ^2\d\d$
  pattern2 3xx ^3\d\d$
  pattern3 404 ^404$    # we want only 404 counts...
  pattern4 4xx ^4\d\d$  # pattern4 doesn't matches messages matches pattern[123]
  pattern5 5xx ^5\d\d$
  <store>
  ....
  </store>
</match>

store_file parameter error - "parameter is already removed"

Hey,

I've added the store_file parameter in order to save counter values in case the pod is restarted, and I'm getting the following error on startup:

[error]: config error file="/fluentd/etc/fluent.conf" error_class=Fluent::ObsoletedParameterError error="'store_file' parameter is already removed: Use store_storage parameter instead."

configuration:

@type "datacounter"
tag "common_log_counter"
count_key "message"
count_interval 5s
pattern1 "requestCommon requestCommon"
store_file "/var/log/fluent/datacounter"

I tried using store_storage, but it doesn't seem to work since the file isn't created.
