fluent-plugin-multiline-parser

Component

ParserOutput

This is a Fluentd plugin to parse strings in log messages and re-emit them. This parser also supports multiline format.

ParserOutput

ParserOutput accepts the same 'format' and 'time_format' options as 'in_tail':

<match raw.apache.common.*>
  @type parser
  remove_prefix raw
  format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
  time_format %d/%b/%Y:%H:%M:%S %z
  key_name message
</match>
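
As a rough illustration of what 'format' and 'time_format' do, the Ruby sketch below applies the same regular expression to one log line and parses the captured 'time' field with the same time format. The sample line is made up, and this is not the plugin's actual code, only a minimal approximation of the parsing step.

```ruby
require 'time'

# Same pattern and time format as in the configuration above.
pattern = /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
time_format = '%d/%b/%Y:%H:%M:%S %z'

# Made-up sample line in Apache common log format (an assumption).
line = '192.168.0.1 - alice [28/Feb/2013:12:00:00 +0900] "GET /index.html HTTP/1.1" 200 777'

# Named captures become the fields of the parsed record.
record = pattern.match(line).named_captures

# The 'time' capture is taken out of the record and parsed separately.
event_time = Time.strptime(record.delete('time'), time_format)

p event_time
p record
```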

You can also use the predefined formats 'apache' and 'syslog':

<match raw.apache.combined.*>
  @type parser
  remove_prefix raw
  format apache
  key_name message
</match>

To parse multiline logs:

<filter raw.java.*>
  @type parser
  format multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2}/
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
  key_name message
</filter>
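
The following Ruby sketch shows one way to think about the multiline options. It assumes the plugin behaves like Fluentd's multiline parser, where the formatN patterns are compiled in multiline mode so that '.' also matches newlines; the grouping of lines by 'format_firstline' and the sample log lines are assumptions made for illustration.

```ruby
# Same patterns as in the configuration above; the /m flag makes '.' match
# newlines, so 'message' can span continuation lines.
firstline = /\d{4}-\d{1,2}-\d{1,2}/
format1 = /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/m

# Made-up Java-style log lines (an assumption).
lines = [
  '2013-3-03 14:27:33 [main] INFO Main - Start',
  '2013-3-03 14:27:33 [main] ERROR Main - Exception',
  'javax.management.RuntimeErrorException: null',
  '    at Main.main(Main.java:16)'
]

# A line matching 'firstline' starts a new event; other lines are appended
# to the event in progress.
events = lines.slice_before { |l| firstline.match?(l) }.map { |chunk| chunk.join("\n") }

# Each joined event is parsed with format1.
records = events.map { |e| format1.match(e).named_captures }

records.each { |r| p r }
```

Because '.*' is greedy in multiline mode, a ']' inside a continuation line would be absorbed into 'thread'; a stricter pattern such as '[^\]]*' avoids that.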

fluent-plugin-multiline-parser uses Fluentd's parser plugins (including your own custom parser plugins). See the documentation for details: http://docs.fluentd.org/articles/parser-plugin-overview

To keep the original attribute-data pairs in the re-emitted message, specify 'reserve_data':

<match raw.apache.*>
  @type parser
  tag apache
  format apache
  key_name message
  reserve_data yes
</match>
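
The effect of 'reserve_data' can be pictured as a hash merge. The Ruby sketch below is an assumption about the behavior, not the plugin's code: `emit_record` is a hypothetical helper, and the sample record and key=value pattern are made up.

```ruby
# Hypothetical helper: with reserve_data the parsed fields are merged into
# the original record; without it only the parsed fields are emitted.
def emit_record(original, parsed, reserve_data:)
  reserve_data ? original.merge(parsed) : parsed
end

# Made-up input record with one field to parse and one unrelated field.
original = { 'message' => 'col1=foo col2=bar', 'source' => 'app01' }
parsed = /^col1=(?<col1>.+) col2=(?<col2>.+)$/.match(original['message']).named_captures

with_reserve = emit_record(original, parsed, reserve_data: true)
without_reserve = emit_record(original, parsed, reserve_data: false)

p with_reserve
p without_reserve
```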

To suppress the 'pattern not match' log, add 'suppress_parse_error_log true' to the configuration. The default value is false.

<match in.hogelog>
  @type parser
  tag hogelog
  format /^col1=(?<col1>.+) col2=(?<col2>.+)$/
  key_name message
  suppress_parse_error_log true
</match>
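
A minimal Ruby sketch of the idea, under the assumption that the option only controls whether a warning is logged when the pattern fails to match. `parse_or_warn` is a hypothetical helper and the exact wording of the log message is illustrative.

```ruby
require 'logger'
require 'stringio'

# Hypothetical helper: returns the parsed fields, or nil when the pattern
# does not match. The warning is skipped when suppression is enabled.
def parse_or_warn(pattern, text, log:, suppress_parse_error_log: false)
  m = pattern.match(text)
  return m.named_captures if m

  log.warn("pattern not match: #{text.inspect}") unless suppress_parse_error_log
  nil
end

pattern = /^col1=(?<col1>.+) col2=(?<col2>.+)$/
sink = StringIO.new          # collects log output in memory
log = Logger.new(sink)

parsed = parse_or_warn(pattern, 'col1=a col2=b', log: log)
parse_or_warn(pattern, 'unexpected line', log: log)                                  # logs a warning
parse_or_warn(pattern, 'unexpected line', log: log, suppress_parse_error_log: true)  # stays silent

puts sink.string
```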

To store parsed values under keys with a given prefix, use the 'inject_key_prefix' option:

<match raw.sales.*>
  @type parser
  tag sales
  format json
  key_name sales
  reserve_data      yes
  inject_key_prefix sales.
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"sales":"{\"user\":1,\"num\":2}","sales.user":1, "sales.num":2}
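
The transformation in the comments above can be reproduced with a few lines of Ruby. This is only a sketch of the data flow, not the plugin's implementation.

```ruby
require 'json'

record = { 'sales' => '{"user":1,"num":2}' }

# Parse the JSON string stored under key_name.
parsed = JSON.parse(record['sales'])

# inject_key_prefix: prepend the prefix to every parsed key.
prefixed = parsed.transform_keys { |k| "sales.#{k}" }

# reserve_data yes: keep the original fields alongside the parsed ones.
output = record.merge(prefixed)

puts JSON.generate(output)
```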

To store parsed values as a hash in a single field, use the 'hash_value_field' option:

<match raw.sales.*>
  @type parser
  tag sales
  format json
  key_name sales
  hash_value_field parsed
</match>
# input string of 'sales': {"user":1,"num":2}
# output data: {"parsed":{"user":1, "num":2}}

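Other options (e.g. reserve_data, inject_key_prefix) can be combined with hash_value_field. For example, with 'reserve_data yes' and 'inject_key_prefix sales.' added to the configuration above:

# output data: {"sales":"{\"user\":1,\"num\":2}", "parsed":{"sales.user":1, "sales.num":2}}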
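
How these options interact can be sketched in Ruby as follows. `build_output` is a hypothetical function written for illustration; it reproduces the outputs shown in the comments above but is not taken from the plugin's source.

```ruby
require 'json'

# Hypothetical re-implementation of the option handling, for illustration only.
def build_output(record, key_name:, reserve_data: false, inject_key_prefix: nil, hash_value_field: nil)
  values = JSON.parse(record[key_name])
  # inject_key_prefix: rename parsed keys first.
  values = values.transform_keys { |k| "#{inject_key_prefix}#{k}" } if inject_key_prefix
  # reserve_data: start from the original record instead of an empty one.
  base = reserve_data ? record.dup : {}
  # hash_value_field: nest the parsed values under one key instead of merging.
  hash_value_field ? base.merge({ hash_value_field => values }) : base.merge(values)
end

input = { 'sales' => '{"user":1,"num":2}' }

nested = build_output(input, key_name: 'sales', hash_value_field: 'parsed')
combined = build_output(input, key_name: 'sales', reserve_data: true,
                        inject_key_prefix: 'sales.', hash_value_field: 'parsed')

puts JSON.generate(nested)
puts JSON.generate(combined)
```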

To skip time parsing (keeping a field such as 'time' in the record as-is), specify 'time_parse no':

<match raw.sales.*>
  @type parser
  tag sales
  format json
  key_name sales
  hash_value_field parsed
  time_parse no
</match>
# input string of 'sales': {"user":1,"num":2,"time":"2013-10-31 12:48:33"}
# output data: {"parsed":{"user":1, "num":2,"time":"2013-10-31 12:48:33"}}
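
The difference can be sketched in Ruby as below. It assumes that, when time parsing is enabled, the 'time' field is removed from the record and used as the event time, while 'time_parse no' leaves it in the record as a plain string. `parse_sales` and its `fallback_time` parameter are hypothetical names used only for this illustration.

```ruby
require 'json'
require 'time'

# Hypothetical helper returning [event_time, record].
def parse_sales(json, time_parse:, fallback_time:)
  record = JSON.parse(json)
  if time_parse && record.key?('time')
    # The field is consumed and becomes the event time.
    [Time.parse(record.delete('time')), record]
  else
    # The field stays in the record untouched.
    [fallback_time, record]
  end
end

input = '{"user":1,"num":2,"time":"2013-10-31 12:48:33"}'
now = Time.now

parsed_time, parsed_record = parse_sales(input, time_parse: true, fallback_time: now)
kept_time, kept_record = parse_sales(input, time_parse: false, fallback_time: now)

p parsed_record
p kept_record
```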

ParserFilter

This is the filter version of ParserOutput.

Note that the filter version of the parser plugin cannot modify tags.
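
To make the difference concrete, the Ruby sketch below shows how an output-style plugin might rewrite a tag with an option such as 'remove_prefix', whereas a filter passes the tag through unchanged. `strip_prefix` is a hypothetical helper and the sample tag is made up; the plugin's real tag handling may differ.

```ruby
# Hypothetical helper: drop "<prefix>." from the start of a tag, if present.
def strip_prefix(tag, prefix)
  head = "#{prefix}."
  tag.start_with?(head) ? tag[head.length..-1] : tag
end

tag = 'raw.apache.common.access'

output_tag = strip_prefix(tag, 'raw')  # output plugin with 'remove_prefix raw'
filter_tag = tag                       # a filter leaves the tag as it is

puts output_tag
puts filter_tag
```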

ParserFilter accepts the same 'format' and 'time_format' options as 'in_tail':

<filter raw.apache.common.*>
  @type parser
  format /^(?<host>[^ ]*) [^ ]* (?<user>[^ ]*) \[(?<time>[^\]]*)\] "(?<method>\S+)(?: +(?<path>[^ ]*) +\S*)?" (?<code>[^ ]*) (?<size>[^ ]*)$/
  time_format %d/%b/%Y:%H:%M:%S %z
  key_name message
</filter>

You can also use the predefined formats 'apache' and 'syslog':

<filter raw.apache.combined.*>
  @type parser
  format apache
  key_name message
</filter>

fluent-plugin-multiline-parser uses Fluentd's parser plugins (including your own custom parser plugins). See the documentation for details: http://docs.fluentd.org/articles/parser-plugin-overview

To parse multiline logs:

<filter raw.java.*>
  @type parser
  format multiline
  format_firstline /\d{4}-\d{1,2}-\d{1,2}/
  format1 /^(?<time>\d{4}-\d{1,2}-\d{1,2} \d{1,2}:\d{1,2}:\d{1,2}) \[(?<thread>.*)\] (?<level>[^\s]+)(?<message>.*)/
  key_name message
</filter>

To keep the original attribute-data pairs in the re-emitted message, specify 'reserve_data':

<filter raw.apache.*>
  @type parser
  format apache
  key_name message
  reserve_data yes
</filter>

To suppress the 'pattern not match' log, add 'suppress_parse_error_log true' to the configuration. The default value is false.

<filter in.hogelog>
  @type parser
  format /^col1=(?<col1>.+) col2=(?<col2>.+)$/
  key_name message
  suppress_parse_error_log true
</filter>

To store parsed values under keys with a given prefix, use the 'inject_key_prefix' option:

<filter raw.sales.*>
  @type parser
  format json
  key_name sales
  reserve_data      yes
  inject_key_prefix sales.
</filter>
# input string of 'sales': {"user":1,"num":2}
# output data: {"sales":"{\"user\":1,\"num\":2}","sales.user":1, "sales.num":2}

To store parsed values as a hash in a single field, use the 'hash_value_field' option:

<filter raw.sales.*>
  @type parser
  format json
  key_name sales
  hash_value_field parsed
</filter>
# input string of 'sales': {"user":1,"num":2}
# output data: {"parsed":{"user":1, "num":2}}

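Other options (e.g. reserve_data, inject_key_prefix) can be combined with hash_value_field. For example, with 'reserve_data yes' and 'inject_key_prefix sales.' added to the configuration above:

# output data: {"sales":"{\"user\":1,\"num\":2}", "parsed":{"sales.user":1, "sales.num":2}}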

To skip time parsing (keeping a field such as 'time' in the record as-is), specify 'time_parse no':

<filter raw.sales.*>
  @type parser
  format json
  key_name sales
  hash_value_field parsed
  time_parse no
</filter>
# input string of 'sales': {"user":1,"num":2,"time":"2013-10-31 12:48:33"}
# output data: {"parsed":{"user":1, "num":2,"time":"2013-10-31 12:48:33"}}

TODO

  • consider what to do next
  • patches welcome!

Copyright

  • Copyright
    • Copyright (c) 2016 Jerry Zhou
  • License
    • Apache License, Version 2.0
