Comments (16)

marktheunissen commented on May 13, 2024

I see there is this line in the header:

char buf[8192 * 2];               /* read buffer: 16Kb max */

Would changing that help me?

from fluent-bit.

edsiper commented on May 13, 2024

The stdin buffer size is fixed; the workaround would be to increase it.

Anyway, I will add more flexible support for that in the next version. By the way,
what kind of data does your file contain?


marktheunissen commented on May 13, 2024

Thanks!

The file is JSON data of events. I'm forwarding them to FluentD running on a cloud VM.

So all I need to do is recompile with a larger buffer size on that exact line in the header, and it will solve the issue? How big can I reasonably go? 100 megs? :)

edsiper commented on May 13, 2024

Increasing the buffer would help a bit, but it's not a real solution. Let me
get back to you this week with some improvements in that area.


nokute78 commented on May 13, 2024

The problem is the handling of incomplete JSON data.

fluent-bit reported:
[ warn] STDIN data incomplete, waiting for more data...

In this case, fluent-bit was waiting for the rest of the JSON data.
However, ctx->buf was already full, so ctx->buf_len had reached its maximum.

fluent-bit then kept calling read(2) with a length of 0 bytes (because sizeof(ctx->buf) and ctx->buf_len are both 16384).
As a result, fluent-bit can never read the JSON data that follows.

    bytes = read(ctx->fd,
                 ctx->buf + ctx->buf_len,
                 sizeof(ctx->buf) - ctx->buf_len);
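
To make the stall concrete, here is a minimal sketch, assuming a POSIX system. The struct and function names (demo_ctx, demo_read_more, demo_full_buffer_read) are made up for illustration and are not fluent-bit's; only the shape of the read(2) call is taken from the snippet above. Data is waiting in a pipe, but because buf_len already equals the buffer size, the requested length is 0 and nothing is read.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BUF_SIZE 16384            /* same size as char buf[8192 * 2] */

/* Hypothetical stand-in for the plugin context (not fluent-bit's struct). */
struct demo_ctx {
    int fd;
    char buf[DEMO_BUF_SIZE];
    size_t buf_len;                    /* bytes already stored in buf */
};

/* Same call shape as the snippet above: append after the buffered bytes. */
static ssize_t demo_read_more(struct demo_ctx *ctx)
{
    assert(ctx->buf_len <= sizeof(ctx->buf));
    return read(ctx->fd,
                ctx->buf + ctx->buf_len,
                sizeof(ctx->buf) - ctx->buf_len);
}

/* Returns what read(2) reports when data is pending but the buffer is full. */
static ssize_t demo_full_buffer_read(void)
{
    struct demo_ctx ctx;
    int fds[2];
    ssize_t n;

    if (pipe(fds) != 0) {
        return -1;
    }
    /* One 14-byte JSON line is waiting in the pipe. */
    if (write(fds[1], "{\"foo\":\"bar\"}\n", 14) != 14) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    ctx.fd = fds[0];
    ctx.buf_len = sizeof(ctx.buf);     /* pretend the buffer is already full */
    n = demo_read_more(&ctx);          /* requested length is 0 */

    close(fds[0]);
    close(fds[1]);
    return n;
}
```

Calling demo_full_buffer_read() shows that the pending line stays in the pipe: the call asks for zero bytes, so the buffer never advances no matter how often it is retried.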

nokute78 commented on May 13, 2024

To fix it, we need to know the position where the last complete JSON object ends.

For example, suppose fluent-bit has read this data:
{"foo":"bar"}{"foo":"bar"}{"foo":"bar"}...{"foo":"bar"}{"foo":"bar"}{"hoge":

fluent-bit should pack
{"foo":"bar"}{"foo":"bar"}{"foo":"bar"}...{"foo":"bar"}{"foo":"bar"}

and
should leave only the rest of the JSON, {"hoge":, in ctx->buf and continue reading the data that follows.
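
A rough sketch of that idea, not the code that ended up in fluent-bit: scan the buffer while tracking brace depth and string state, remember where the last top-level object closed, then move the unfinished tail to the front. The function names json_complete_prefix and keep_tail are hypothetical, and the scanner only handles top-level JSON objects (maps), which is an assumption based on the examples in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Returns the offset just past the last complete top-level JSON object
 * in buf[0..len), or 0 if no object is complete yet.
 */
static size_t json_complete_prefix(const char *buf, size_t len)
{
    size_t i;
    size_t end = 0;
    int depth = 0;
    int in_string = 0;
    int escaped = 0;

    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (in_string) {
            if (escaped) {
                escaped = 0;           /* character after a backslash */
            } else if (c == '\\') {
                escaped = 1;
            } else if (c == '"') {
                in_string = 0;
            }
        } else if (c == '"') {
            in_string = 1;
        } else if (c == '{') {
            depth++;
        } else if (c == '}' && depth > 0) {
            depth--;
            if (depth == 0) {
                end = i + 1;           /* a top-level object just closed */
            }
        }
    }
    return end;
}

/* Drops the first `consumed` bytes, moves the tail to the front of buf,
 * and returns the new length. */
static size_t keep_tail(char *buf, size_t len, size_t consumed)
{
    assert(consumed <= len);
    memmove(buf, buf + consumed, len - consumed);
    return len - consumed;
}
```

With the example above, everything up to the last `}` would be packed, and only `{"hoge":` stays buffered, so the next read(2) has room to append the rest of that object.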

marktheunissen commented on May 13, 2024

What about using a newline delimiter? That's how my data is structured:

{"foo":"bar"}\n
{"foo":"bar"}\n
{"foo":"bar"}\n
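
If the input really is one JSON object per line, finding the safe cut point could be as simple as looking for the last newline in the buffer. This is only a sketch of that suggestion; last_line_end is a hypothetical helper, not something from fluent-bit.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Returns the offset just past the last '\n' in buf[0..len),
 * or 0 if the buffer does not contain a complete line yet.
 */
static size_t last_line_end(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        if (buf[len - 1] == '\n') {
            return len;
        }
        len--;
    }
    return 0;
}
```

Everything before the returned offset consists of whole lines that can be processed; the bytes after it are a partial line that has to wait for more data.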

nokute78 commented on May 13, 2024

@marktheunissen
I also tested with data that uses a \n delimiter. (Oops, I forgot to write the delimiter in my comment above.)

Could you test this script as a workaround?

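#!/bin/sh

# Emit the file one line at a time, pausing 30 ms after each line.
while IFS= read -r line
do
  printf '%s\n' "$line"
  usleep 30000   # usleep is not available on every system
done < "$1"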

How to use:

$ chmod +x workaround.sh
$ ./workaround.sh event.log | td-agent-bit -c td-agent-bit.conf

edsiper commented on May 13, 2024

@marktheunissen can you upload an example file with the JSON samples? (100MB would be ideal)

edsiper commented on May 13, 2024

I have pushed a new branch called stdin_buffer, which adds the following features:

  • support for concatenated JSON maps
  • the incoming buffer is now dynamic and defaults to 32KB
  • a new configuration property, buffer_size, to adjust the size of the incoming buffer. To choose a value, ask: does each JSON map fit in 32KB? If not, use a larger value.
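
For readers wondering what a configurable buffer changes compared with the fixed char buf[8192 * 2], here is a minimal sketch. It is not the code from the stdin_buffer branch; the struct, the function names, and the convention that 0 means "not configured" are all assumptions made for the example. Only the 32KB default and the buffer_size property come from the list above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_DEFAULT_BUFFER_SIZE 32768   /* 32KB default mentioned above */

/* Hypothetical context with a heap-allocated buffer instead of a fixed array. */
struct dyn_ctx {
    char *buf;
    size_t buf_size;   /* capacity, taken from buffer_size */
    size_t buf_len;    /* bytes currently stored */
};

/* buffer_size == 0 means "not configured" in this sketch. Returns 0 on success. */
static int dyn_ctx_init(struct dyn_ctx *ctx, size_t buffer_size)
{
    ctx->buf_size = buffer_size > 0 ? buffer_size : DEMO_DEFAULT_BUFFER_SIZE;
    ctx->buf_len = 0;
    ctx->buf = malloc(ctx->buf_size);
    return ctx->buf != NULL ? 0 : -1;
}

/* Room left for the next read(2); replaces sizeof(ctx->buf) - ctx->buf_len. */
static size_t dyn_ctx_space(const struct dyn_ctx *ctx)
{
    assert(ctx->buf_len <= ctx->buf_size);
    return ctx->buf_size - ctx->buf_len;
}

static void dyn_ctx_destroy(struct dyn_ctx *ctx)
{
    free(ctx->buf);
    ctx->buf = NULL;
    ctx->buf_size = 0;
    ctx->buf_len = 0;
}
```

The point of the sketch is that the capacity becomes a runtime value, so a single JSON map larger than the default can still fit once buffer_size is raised.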

Please get a copy of that branch, build it and give it a try with:

$ cat test.json | fluent-bit -i stdin -t test -o stdout -m test 

Please test it and let me know how it works.

marktheunissen commented on May 13, 2024

Cool, I tested and it works, thanks!

cat events.log | /usr/bin/stdbuf -oL ~/src/fluent-bit/build/bin/fluent-bit -i stdin -o stdout -f 1 > outtest

now cat outtest | wc -l gives the right number of lines 👍

marktheunissen commented on May 13, 2024

I have noticed another issue, but I'll open a new ticket.

edsiper commented on May 13, 2024

Thanks, I will merge this new feature into the 0.8 branch.

nokute78 commented on May 13, 2024

@edsiper
I also tested stdin_buffer, and a segfault occurred.
ctx->buf_len grew larger than ctx->buf_size.
Could you check it?
Should we also reset ctx->buf_len when ret == FLB_ERR_JSON_INVAL?
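
To illustrate the invariant in question (buf_len must never exceed buf_size), here is a small sketch of how the length could be updated after each parse attempt. The enum and the function demo_next_buf_len are hypothetical; only the "invalid JSON" case corresponds to something named in this thread (FLB_ERR_JSON_INVAL). Dropping a full buffer that still holds an incomplete record is an assumption of this sketch, not a description of what fluent-bit does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parse outcomes used only for this example. */
enum demo_parse_ret {
    DEMO_JSON_OK,
    DEMO_JSON_INCOMPLETE,
    DEMO_JSON_INVALID
};

/*
 * Returns the buffer length to carry into the next read(2).
 * `consumed` is the number of bytes that were packed successfully.
 */
static size_t demo_next_buf_len(size_t buf_len, size_t buf_size,
                                size_t consumed, enum demo_parse_ret ret)
{
    assert(buf_len <= buf_size);
    assert(consumed <= buf_len);

    switch (ret) {
    case DEMO_JSON_OK:
        return buf_len - consumed;     /* keep only the unparsed tail */
    case DEMO_JSON_INVALID:
        return 0;                      /* discard the bad data and start over */
    case DEMO_JSON_INCOMPLETE:
        /* A full buffer can never complete the record, so drop it. */
        return buf_len == buf_size ? 0 : buf_len;
    }
    return 0;
}
```

Whatever the real fix looks like, the key property is that every branch returns a value no larger than buf_size, so the next read(2) never writes past the end of the buffer.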

I generated JSON data using this script.

#!/bin/sh

for i in `seq 0 10000`
do
    echo "{\"foo\":\"bar\"}"
done

$ sh output.sh > data.txt
$ cat data.txt | bin/fluent-bit -i stdin -o stdout  -vv

Result:

$ cat data.txt |bin/fluent-bit -i stdin -o stdout -f 1 -vv
Fluent-Bit v0.8.4
Copyright (C) Treasure Data

[2016/07/30 15:33:57] [ info] starting engine
[2016/07/30 15:33:57] [debug] [stdin] buffer_size=32768 bytes
[2016/07/30 15:33:57] [debug] [router] default match rule stdin.0:stdout.0
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
Segmentation fault

edsiper commented on May 13, 2024

Thanks for catching that problem. I fixed the INVAL reset in ad18723.

Note: I merged all changes into the 0.8 branch (stdin_buffer should no longer be used).

nokute78 commented on May 13, 2024

@edsiper Thanks. I will check it.
