fluent-plugin-papertrail's Issues

RuntimeError error

Hello,

I'm new to Papertrail, and I keep getting the following error:

2022-03-07 13:45:45 +0000 [warn]: #0 dump an error event: error_class=RuntimeError error="failed to expand %Q[\#{record[\"kubernetes\"][\"namespace_name\"]}-\#{record[\"kubernetes\"][\"pod_name\"]}] : error = undefined method []' for nil:NilClass" location="/var/lib/gems/2.3.0/gems/fluentd-1.2.6/lib/fluent/plugin/filter_record_transformer.rb:310:in rescue in expand'" tag="kubernetes.1" time=2022-03-07 13:45:44.206022446 +0000 record={"message"=>"2022-01-11T13:23:33.587409310+00:00 stderr F time="2022-01-11T13:23:33Z" level=debug msg="proxied connection finished""}

fluent.conf
<source>
@type tail
path /var/log/containers/*.log
pos_file fluentd-containers.log.pos
format none
read_from_head true
tag kubernetes.1
</source>

<filter kubernetes.**>
type kubernetes_metadata
</filter>

<filter kubernetes.**>
type record_transformer
enable_ruby true
<record>
hostname ${record["kubernetes"]["namespace_name"]}-${record["kubernetes"]["pod_name"]}
program ${record["kubernetes"]["container_name"]}
message ${record['log']}
</record>
</filter>

<match **>
@type papertrail
papertrail_host "#{ENV['FLUENT_PAPERTRAIL_HOST']}"
papertrail_port "#{ENV['FLUENT_PAPERTRAIL_PORT']}"
num_threads 4
buffer_type file
buffer_path /var/log/fluentd-buffer-papertrail
</match>

Any help is appreciated.
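(For what it's worth, the error means record["kubernetes"] is nil for at least some events — e.g. lines the kubernetes_metadata filter could not enrich — so indexing into it raises. A minimal Ruby illustration with a hypothetical record:

    # One tailed event has no "kubernetes" key, so record["kubernetes"] is nil:
    record = { "message" => "proxied connection finished" }

    record["kubernetes"]["namespace_name"]
    # => undefined method `[]' for nil:NilClass (the error wrapped above)

    # Hash#dig returns nil instead of raising when a key is missing:
    record.dig("kubernetes", "namespace_name")
    # => nil

With enable_ruby true, ${record.dig("kubernetes", "namespace_name")} in the record_transformer block should avoid the exception, at the cost of a blank hostname for unenriched events.)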

Randomly stops sending logs to Papertrail endpoint

I can't figure out from the error message, which we get at random, what the real problem is, but the plugin stops sending logs:

2018-10-17 12:55:14 +0000 [warn]: failed to flush the buffer. retry_time=0 next_retry_seconds=2018-10-17 12:55:15 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/syslog_protocol-0.9.2/lib/syslog_protocol/packet.rb:72:in `severity='
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:82:in `create_packet'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:44:in `block in write'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/event.rb:323:in `each'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/event.rb:323:in `block in each'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/memory_chunk.rb:80:in `open'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/buffer/memory_chunk.rb:80:in `open'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/event.rb:322:in `each'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:39:in `write'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/compat/output.rb:131:in `write'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/output.rb:1094:in `try_flush'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/output.rb:1319:in `flush_thread_run'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin/output.rb:439:in `block (2 levels) in start'
  2018-10-17 12:55:14 +0000 [warn]: /var/lib/gems/2.3.0/gems/fluentd-1.1.0/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2018-10-17 12:55:15 +0000 [warn]: failed to flush the buffer. retry_time=0 next_retry_seconds=2018-10-17 12:55:15 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
  2018-10-17 12:55:15 +0000 [warn]: suppressed same stacktrace
2018-10-17 12:55:16 +0000 [warn]: failed to flush the buffer. retry_time=1 next_retry_seconds=2018-10-17 12:55:16 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
  2018-10-17 12:55:16 +0000 [warn]: suppressed same stacktrace
2018-10-17 12:55:18 +0000 [warn]: failed to flush the buffer. retry_time=2 next_retry_seconds=2018-10-17 12:55:18 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
  2018-10-17 12:55:18 +0000 [warn]: suppressed same stacktrace
2018-10-17 12:55:22 +0000 [warn]: failed to flush the buffer. retry_time=3 next_retry_seconds=2018-10-17 12:55:22 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
  2018-10-17 12:55:22 +0000 [warn]: suppressed same stacktrace
2018-10-17 12:55:29 +0000 [warn]: failed to flush the buffer. retry_time=4 next_retry_seconds=2018-10-17 12:55:29 +0000 chunk="5786c2e64b0dca0038a4b247899ff483" error_class=ArgumentError error="'I' is not a designated severity"
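(For context: the plugin hands the record's severity field straight to the syslog_protocol gem, which only accepts the designated syslog severity names, so a single-letter level such as 'I' raises the ArgumentError above on every retry and blocks the chunk indefinitely. A sketch of a normalization step — the alias table is an assumption about the upstream log format, not part of the plugin:

    # Map single-letter levels onto the severities syslog_protocol accepts.
    SEVERITY_ALIASES = {
      'D' => 'debug', 'I' => 'info', 'W' => 'warning', 'E' => 'err'
    }.freeze
    VALID_SEVERITIES = %w[emerg alert crit err warning notice info debug].freeze

    def normalize_severity(value, default = 'info')
      s = value.to_s
      return SEVERITY_ALIASES[s] if SEVERITY_ALIASES.key?(s)
      VALID_SEVERITIES.include?(s.downcase) ? s.downcase : default
    end

Until the plugin guards against this itself (see the next issue), the same mapping can be applied in a record_transformer filter before the match block.)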

Systemd::JournalError: No such file or directory

I'm continually getting this warning when running this container in Kubernetes.

I've searched and found this suggested fix:
env:
- name: FLUENTD_SYSTEMD_CONF
value: "disable"

which I've added under the containers section, but it's not making a difference.

Any idea how to remove those warnings?

Override invalid severity values rather than failing

We have a user-submitted issue where an invalid severity value was bringing the plugin to a halt: #25

We should add an option to override invalid severity values with info. Does this apply to other values, like facility, as well?

The above referenced issue has links to implementation details for the existing process.
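A minimal sketch of what the override could look like in the plugin's create_packet — the default_severity/default_facility names and the rescue placement are assumptions, not the existing code:

    # Fall back to a configured default instead of letting one bad value
    # fail the flush and stall the whole buffer.
    begin
      packet.severity = record['severity'] || default_severity
    rescue ArgumentError   # e.g. "'I' is not a designated severity"
      packet.severity = default_severity
    end
    begin
      packet.facility = record['facility'] || default_facility
    rescue ArgumentError   # facility strings are validated the same way
      packet.facility = default_facility
    end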

Plugin does not handle idle disconnects in a timely fashion. Results in dropped logs

The Papertrail documentation says idle connection timeouts should be handled by the client, i.e. this plugin (emphasis mine):

Papertrail automatically closes idle TCP connections after 15 minutes of inactivity. Log senders should already handle this as part of a full implementation, since connections are routinely affected by Internet reachability changes. Senders can elect to immediately reconnect, or leave the connection closed until new log messages are available.

This plugin does retry, but the failure mode still disrupts delivery:

  • plugin makes a socket connection to papertrail
  • some time passes without logging, and papertrail closes the connection due to the idle timeout
  • logs are generated, and the plugin attempts to write to the socket
  • after 10+ minutes (possibly only triggered by continued writes), the socket write times out
  • a new socket is created, and the chunk is "retried successfully", but log messages are missing on Papertrail

When I started debugging this, I thought it was just delayed messages, but I'm also finding that some messages are dropped entirely during this process. Is it possible to handle this more gracefully, for example with a TCP keepalive?

This syslog output plugin seems to do so (the Socket changes are here)
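For reference, a keepalive only takes a few setsockopt calls on the plain TCP socket. A sketch — since this plugin wraps the socket in SSL per the trace below, these would be set on the underlying TCPSocket before the handshake, and the TCP_KEEP* constants are Linux-specific, hence the guard:

    require 'socket'

    sock = TCPSocket.new(papertrail_host, papertrail_port)
    # Ask the kernel to probe idle connections, so a peer that silently
    # dropped us is detected long before a blocking write times out.
    sock.setsockopt(Socket::SOL_SOCKET, Socket::SO_KEEPALIVE, true)
    if Socket.const_defined?(:TCP_KEEPIDLE)
      sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPIDLE, 120)  # seconds idle before first probe
      sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPINTVL, 30)  # seconds between probes
      sock.setsockopt(Socket::IPPROTO_TCP, Socket::TCP_KEEPCNT, 3)     # failed probes before the socket errors
    end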


Steps to reproduce:

  1. configure fluentd to capture logs and send them via this plugin to papertrail
  2. Log infrequently, e.g.:
root@89f331735970:/app# date; sleep 3600; date
Tue Feb 14 16:39:04 UTC 2023
Tue Feb 14 17:39:04 UTC 2023
root@89f331735970:/app# while true; do date; sleep 10; done
Tue Feb 14 17:55:03 UTC 2023
Tue Feb 14 17:55:13 UTC 2023
Tue Feb 14 17:55:23 UTC 2023
Tue Feb 14 17:55:33 UTC 2023
Tue Feb 14 17:55:43 UTC 2023
  3. See that the log lines for "17:39:04", "17:55:03", and "17:55:13" are dropped and not received in Papertrail
    (screenshot omitted: Screen Shot 2023-02-14 at 10 58 38)

  4. Confirm in the fluentd logs that there was a socket write timeout error

2023-02-14 16:14:50 +0000 [info]: #0 initializing tcp socket for logs3.papertrailapp.com:redacted_port
2023-02-14 17:55:24 +0000 [warn]: #0 failed to flush the buffer. retry_times=0 next_retry_time=2023-02-14 17:55:25 +0000 chunk="5f4acaafd238fdba7d32b4b3b001c59f" error_class=Fluent::Papertrail::SocketFailureError error="Error writing to logs3.papertrailapp.com:redacted_port: Operation timed out"
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:322:in `syswrite'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:322:in `do_write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/2.7.0/openssl/buffering.rb:409:in `puts'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:133:in `send_to_papertrail'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:45:in `block in write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:315:in `each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:315:in `block in each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/buffer/memory_chunk.rb:81:in `open'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/buffer/memory_chunk.rb:81:in `open'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/event.rb:314:in `each'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluent-plugin-papertrail-0.2.8/lib/fluent/plugin/out_papertrail.rb:39:in `write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/compat/output.rb:131:in `write'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1179:in `try_flush'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:1500:in `flush_thread_run'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin/output.rb:499:in `block (2 levels) in start'
  2023-02-14 17:55:24 +0000 [warn]: #0 /usr/lib/ruby/gems/2.7.0/gems/fluentd-1.14.6/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
 2023-02-14 17:55:25 +0000 [info]: #0 initializing tcp socket for logs3.papertrailapp.com:redacted
2023-02-14 17:55:25 +0000 [warn]: #0 retry succeeded. chunk_id="5f4acaafd238fdba7d32b4b3b001c59f"

Strange record size limits

I just ran this plugin in our setup and found two undocumented limits:

  1. Tag size is limited to 32 characters - if the tag is longer (which is true for us), the plugin throws an error;
  2. It seems that the message field is truncated to 969 characters.
    Is that a limitation or am I missing something?
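(Both numbers line up with RFC 3164 limits enforced by the syslog_protocol gem this plugin uses: the TAG field is capped at 32 characters, and the assembled packet at 1024 bytes, so once the priority, timestamp, hostname, and tag headers are counted, roughly 969 bytes remain for the message. A quick illustration, assuming the gem's Packet API as it appears in the stack traces on this page:

    require 'syslog_protocol'

    packet = SyslogProtocol::Packet.new
    packet.hostname = 'host'
    packet.facility = 'local0'
    packet.severity = 'info'
    packet.tag      = 'a' * 32   # 33 or more characters raises an ArgumentError

    packet.content = 'x' * 2000
    packet.assemble.bytesize     # => 1024; the message is truncated to fit
)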

Logging stalls due to unexpected records

This works great as long as each record includes all five required fields. But as soon as one record is missing something, fluentd enters an error loop and stops forwarding logs:

[warn]: temporarily failed to flush the buffer. error="Hostname may not be omitted"

The most foolproof fix would be to modify the out_papertrail plugin to fill in reasonable defaults for each field. Another option would be to modify the fluentd configuration to collect fewer files or transform everything to ensure the five fields are present (like this).
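A sketch of the plugin-side option, merging defaults in before the packet is built — the values below are illustrative, not what the plugin currently does:

    # Fill in any of the five fields the record is missing, so a sparse
    # record still yields a valid syslog packet instead of a failed flush.
    DEFAULT_FIELDS = {
      'hostname' => 'unknown',
      'facility' => 'local0',
      'severity' => 'info',
      'program'  => 'fluentd',
      'message'  => ''
    }.freeze

    def with_defaults(record)
      DEFAULT_FIELDS.merge(record)   # values already in the record win
    end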

Plugin stops sending logs to endpoint after Fluent::Papertrail::SocketFailureError

Hi folks - first off, thanks for maintaining this plugin! 👍

Occasionally (every few days or weeks), our Papertrail pods, running in Kubernetes as a DaemonSet, stop logging after a connection reset.

It appears that the plugin itself recovers from the error and keeps running, but the exception is never logged to Papertrail, and neither are any subsequent logs. Interestingly, if we kill the misbehaving pod, the graceful shutdown causes it to flush its buffer and we see all of the logs it had been keeping to itself.

Here is an example with the stacktrace (PT endpoints redacted to logs.papertrailapp.com:12345):

2019-01-25 03:35:59 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 4, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256
2019-01-25 03:36:29 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 4, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256
2019-01-25 03:36:59 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 4, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256
2019-01-25 03:37:27 +0000 [warn]: #0 [out_papertrail] failed to flush the buffer. retry_time=0 next_retry_seconds=2019-01-25 03:37:28 +0000 chunk="580400bbe313472fef6c2f9cc64d6193" error_class=Fluent::Papertrail::SocketFailureError error="Error writing to logs.papertrailapp.com:12345: Connection reset by peer"
  2019-01-25 03:37:27 +0000 [warn]: #0 /usr/lib/ruby/2.3.0/openssl/buffering.rb:322:in `syswrite'
  2019-01-25 03:37:27 +0000 [warn]: #0 /usr/lib/ruby/2.3.0/openssl/buffering.rb:322:in `do_write'
  2019-01-25 03:37:27 +0000 [warn]: #0 /usr/lib/ruby/2.3.0/openssl/buffering.rb:407:in `puts'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:119:in `send_to_papertrail'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:45:in `block in write'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/event.rb:323:in `each'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/event.rb:323:in `block in each'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/plugin/buffer/file_chunk.rb:172:in `open'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/event.rb:322:in `each'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluent-plugin-papertrail-0.2.6/lib/fluent/plugin/out_papertrail.rb:39:in `write'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/compat/output.rb:131:in `write'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:1123:in `try_flush'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:1423:in `flush_thread_run'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/plugin/output.rb:452:in `block (2 levels) in start'
  2019-01-25 03:37:27 +0000 [warn]: #0 /fluentd/vendor/bundle/ruby/2.3.0/gems/fluentd-1.3.2/lib/fluent/plugin_helper/thread.rb:78:in `block in thread_create'
2019-01-25 03:37:28 +0000 [info]: #0 [out_papertrail] initializing tcp socket for logs.papertrailapp.com:12345
2019-01-25 03:37:29 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 4, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256
2019-01-25 03:37:59 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 4, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256
2019-01-25 03:38:29 +0000 [info]: #0 [filter_kube_metadata] stats - namespace_cache_size: 1, pod_cache_size: 3, namespace_cache_api_updates: 1256, pod_cache_api_updates: 1259, id_cache_miss: 1256

You can see the reinitialization of the TCP socket, after which logs continue normally. However, we are missing (in the PT interface) a line or two before the stacktrace, the stacktrace itself, and everything after it. The logs above were obtained directly from the pod via kubectl logs after somebody noticed some missing logs and we tracked down the responsible pod.

Our fluent.conf:

    ### Based on https://github.com/fluent/fluentd-kubernetes-daemonset

    @include systemd.conf
    @include kubernetes.conf
    @include conf.d/*.conf

    ## Capture audit logs
    ##<match kube-apiserver-audit>
    ##  @type papertrail
    ##
    ##  papertrail_host "#{ENV['FLUENT_PAPERTRAIL_AUDIT_HOST']}"
    ##  papertrail_port "#{ENV['FLUENT_PAPERTRAIL_AUDIT_PORT']}"
    ##</match>

    <match **>
      @type papertrail
      @id out_papertrail

      papertrail_host "#{ENV['FLUENT_PAPERTRAIL_HOST']}"
      papertrail_port "#{ENV['FLUENT_PAPERTRAIL_PORT']}"

      buffer_type file
      buffer_path /var/log/fluentd-buffer-papertrail
    </match>

All of the other configuration for the plugin is standard, shipped with fluentd-kubernetes-daemonset.

Versions:

Kubernetes: 1.11.6
Fluentd: 1.3.2 (using https://github.com/fluent/fluentd-kubernetes-daemonset)
fluent-plugin-papertrail: 0.2.6

We're open to any suggestions for working around the issue, or for anything we might have misconfigured!

Orphaned Namespace

The logs in our papertrail look like this:

May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd:   </parse>
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd: </filter> is not used.
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd: 2018-05-03 06:26:48 +0000 [warn]: parameter 'include_namespace_metadata' in <filter kubernetes.**>
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd:   type kubernetes_metadata
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd:   include_namespace_metadata true
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd:   annotation_match ["solarwinds.io/*"]
May 03 16:26:50 staging-.orphaned-fluentd-papertrail-global-logging-rqnjj fluentd: </filter> is not used.

The namespace created in Papertrail is .orphaned.

The file it's tailing is:
/var/log/containers/fluentd-papertrail-global-logging-rqnjj_kube-system_fluentd-d2648d90a647ef1b90b4e48514521e26c1322793d99331082a8c3fe02757372d.log.

Docker version:
quay.io/solarwinds/fluentd-kubernetes:v1.1.1-debian-papertrail-0.2.3
Kubernetes Version: 1.10
