Comments (16)
I see there is this line in the header:
char buf[8192 * 2]; /* read buffer: 16Kb max */
Would changing that help me?
from fluent-bit.
The stdin buffer size is fixed, so the workaround would be to increase it.
Anyway, I will add more flexible support for that in the next version. By the way,
what kind of data does your file contain?
On Jul 25, 2016 9:19 AM, "Mark Theunissen" wrote:
I see there is this line in the header:
char buf[8192 * 2]; /* read buffer: 16Kb max */
Would changing that help me?
from fluent-bit.
Thanks!
The file is JSON data of events. I'm forwarding them to FluentD running on a cloud VM.
So all I need to do is recompile with a larger buffer size on that exact line in the header, and it will solve the issue? How big can I reasonably go? 100 megs? :)
from fluent-bit.
Increasing the buffer would help a bit, but it's not a real solution. Let me
get back to you this week with some improvements in that area.
On Jul 25, 2016 10:07 AM, "Mark Theunissen" wrote:
Thanks!
The file is JSON data of events. I'm forwarding them to FluentD running on a cloud VM.
So all I need to do is recompile with a larger buffer size on that exact line in the header, and it will solve the issue? How big can I reasonably go? 100 megs? :)
from fluent-bit.
The problem is in how incomplete JSON data is processed.
fluent-bit reported:
[ warn] STDIN data incomplete, waiting for more data...
At this point fluent-bit was waiting for the rest of the JSON data.
However, ctx->buf was already full, so ctx->buf_len was at its maximum: both sizeof(ctx->buf) and ctx->buf_len were 16384.
The read(2) call below was therefore asked for 0 bytes on every iteration and kept returning 0, so fluent-bit could never receive the remaining JSON data.
bytes = read(ctx->fd,
ctx->buf + ctx->buf_len,
sizeof(ctx->buf) - ctx->buf_len);
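A minimal sketch of why the loop stalls, assuming a POSIX pipe in place of stdin and Linux behaviour for a zero-length read. The names `demo_ctx`, `demo_read` and `demo_stalled_read` are hypothetical and only mirror the `buf` / `buf_len` fields quoted above; the buffer is shrunk to 8 bytes to keep the example small. Once `buf_len` equals the buffer size, the count passed to read(2) is 0, so the call returns 0 even though data is still waiting in the pipe.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical stand-in for the plugin context (the real buffer is 16384 bytes). */
struct demo_ctx {
    int fd;
    char buf[8];
    size_t buf_len;
};

/* Same call shape as the snippet above: read into the free tail of buf. */
static ssize_t demo_read(struct demo_ctx *ctx)
{
    ssize_t bytes = read(ctx->fd,
                         ctx->buf + ctx->buf_len,
                         sizeof(ctx->buf) - ctx->buf_len);
    if (bytes > 0) {
        ctx->buf_len += (size_t) bytes;
    }
    return bytes;
}

/* Returns the result of a second read issued after the buffer is already full. */
static ssize_t demo_stalled_read(void)
{
    const char data[] = "{\"foo\":\"bar\"}{\"hoge\":"; /* 21 bytes, more than buf holds */
    struct demo_ctx ctx;
    int fds[2];
    ssize_t first, second;

    memset(&ctx, 0, sizeof(ctx));
    if (pipe(fds) != 0) {
        return -1;
    }
    ctx.fd = fds[0];
    if (write(fds[1], data, sizeof(data) - 1) != (ssize_t) (sizeof(data) - 1)) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    first = demo_read(&ctx);            /* fills all 8 bytes of buf */
    assert(first == (ssize_t) sizeof(ctx.buf));
    second = demo_read(&ctx);           /* count is now 0, nothing is read */

    close(fds[0]);
    close(fds[1]);
    return second;
}
```

Calling `demo_stalled_read()` yields 0 although 13 bytes are still unread in the pipe, which is the same situation as the repeated "waiting for more data" warning.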
from fluent-bit.
To fix it, we need to know the position where the last complete JSON object ends.
For example, suppose fluent-bit has read the following data:
{"foo":"bar"}{"foo":"bar"}{"foo":"bar"}...{"foo":"bar"}{"foo":"bar"}{"hoge":
fluent-bit should pack
{"foo":"bar"}{"foo":"bar"}{"foo":"bar"}...{"foo":"bar"}{"foo":"bar"}
and then set ctx->buf to hold only the unfinished remainder {"hoge":
and continue reading the data that follows.
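The idea above could be sketched roughly as follows. This is not the fluent-bit implementation; `complete_json_len` and `keep_tail` are hypothetical helpers, and the scanner assumes the stream contains only top-level JSON objects (maps). It tracks brace depth while skipping over string contents, remembers where the last top-level object closed, and then moves the unfinished tail to the front of the buffer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Return the number of leading bytes in buf that form complete top-level
 * JSON objects. Bytes from that offset onward are an unfinished tail.
 */
static size_t complete_json_len(const char *buf, size_t len)
{
    size_t i, end = 0;
    int depth = 0, in_string = 0, escaped = 0;

    for (i = 0; i < len; i++) {
        char c = buf[i];

        if (in_string) {
            /* Braces inside strings must not affect the depth counter. */
            if (escaped) {
                escaped = 0;
            } else if (c == '\\') {
                escaped = 1;
            } else if (c == '"') {
                in_string = 0;
            }
            continue;
        }
        if (c == '"') {
            in_string = 1;
        } else if (c == '{') {
            depth++;
        } else if (c == '}' && depth > 0) {
            if (--depth == 0) {
                end = i + 1;    /* a top-level object just closed */
            }
        }
    }
    return end;
}

/* Drop the consumed prefix, move the tail to the front, return the new length. */
static size_t keep_tail(char *buf, size_t len, size_t consumed)
{
    assert(consumed <= len);
    memmove(buf, buf + consumed, len - consumed);
    return len - consumed;
}
```

After packing the first `complete_json_len()` bytes, the value returned by `keep_tail()` would become the new `buf_len`, so the next read has free space again.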
from fluent-bit.
What about using a newline delimiter? That's how my data is structured:
{"foo":"bar"}\n
{"foo":"bar"}\n
{"foo":"bar"}\n
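With newline-delimited records the boundary search becomes simpler: everything up to and including the last '\n' is complete, and whatever follows is a partial record. A small sketch, where `complete_lines_len` is a hypothetical helper name and not part of fluent-bit:

```c
#include <assert.h>
#include <stddef.h>

/*
 * Return the length of the prefix of buf that ends with the last '\n',
 * or 0 when the buffer holds no complete line yet.
 */
static size_t complete_lines_len(const char *buf, size_t len)
{
    assert(buf != NULL || len == 0);

    while (len > 0) {
        if (buf[len - 1] == '\n') {
            return len;
        }
        len--;
    }
    return 0;
}
```

This only works when no record contains a raw newline, which holds for JSON written one object per line.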
from fluent-bit.
@marktheunissen
I also tested with data that uses a \n delimiter. (Oops, I forgot to include the delimiter in my comment above.)
Could you test this script as a workaround?
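#!/bin/sh
# Print the file given as $1 one line at a time, pausing between lines.
while IFS= read -r line
do
    printf '%s\n' "$line"
    usleep 30000   # 30 ms pause; usleep is not available on every system
done < "$1"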
Usage:
$ chmod +x workaround.sh
$ ./workaround.sh event.log | td-agent-bit -c td-agent-bit.conf
from fluent-bit.
@marktheunissen can you upload an example file with the JSON samples? (Ideally around 100MB.)
from fluent-bit.
I have pushed a new branch called stdin_buffer, which adds the following features:
- support for concatenated JSON maps
- the incoming buffer is now dynamic and defaults to 32KB
- a new configuration property, buffer_size, to adjust the incoming buffer. To pick a value, ask: does each JSON map fit in 32KB? If not, use a larger value.
Please get a copy of that branch, build it, and give it a try with:
$ cat test.json | fluent-bit -i stdin -t test -o stdout -m test
Please test and let me know how it works.
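For readers who want a picture of what a configurable incoming buffer involves, here is a rough sketch. It is not taken from the stdin_buffer branch; the `demo_in` struct and its helpers are hypothetical, and the only details borrowed from the description above are the 32KB default and the idea of a user-supplied size.

```c
#include <assert.h>
#include <stdlib.h>

#define DEMO_DEFAULT_BUF_SIZE (32 * 1024)   /* 32KB default, as described above */

/* Hypothetical input context with a heap-allocated buffer. */
struct demo_in {
    char *buf;
    size_t buf_size;
    size_t buf_len;
};

/* Allocate the buffer; requested == 0 means "use the default". Returns 0 on success. */
static int demo_in_init(struct demo_in *in, size_t requested)
{
    in->buf_size = requested > 0 ? requested : DEMO_DEFAULT_BUF_SIZE;
    in->buf_len = 0;
    in->buf = malloc(in->buf_size);
    return in->buf != NULL ? 0 : -1;
}

/* Free space left for the next read. */
static size_t demo_in_avail(const struct demo_in *in)
{
    assert(in->buf_len <= in->buf_size);
    return in->buf_size - in->buf_len;
}

/* Release the buffer and clear the bookkeeping fields. */
static void demo_in_destroy(struct demo_in *in)
{
    free(in->buf);
    in->buf = NULL;
    in->buf_size = 0;
    in->buf_len = 0;
}
```

The buffer only has to be large enough for the biggest single JSON map, since complete maps are packed and removed as they arrive.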
from fluent-bit.
Cool, I tested and it works, thanks!
cat events.log | /usr/bin/stdbuf -oL ~/src/fluent-bit/build/bin/fluent-bit -i stdin -o stdout -f 1 > outtest
now cat outtest | wc -l
gives the right number of lines 👍
from fluent-bit.
I have noticed another issue, but I'll open a new ticket.
from fluent-bit.
Thanks, I will merge this new feature into the 0.8 branch.
from fluent-bit.
@edsiper
I also tested stdin_buffer, and a segfault occurred.
ctx->buf_len grew larger than ctx->buf_size.
Could you check it?
Should we also reset ctx->buf_len when ret == FLB_ERR_JSON_INVAL?
I generated JSON data using this script.
#!/bin/sh
for i in `seq 0 10000`
do
echo "{\"foo\":\"bar\"}"
done
$ sh output.sh > data.txt
$ cat data.txt | bin/fluent-bit -i stdin -o stdout -vv
Result:
$ cat data.txt |bin/fluent-bit -i stdin -o stdout -f 1 -vv
Fluent-Bit v0.8.4
Copyright (C) Treasure Data
[2016/07/30 15:33:57] [ info] starting engine
[2016/07/30 15:33:57] [debug] [stdin] buffer_size=32768 bytes
[2016/07/30 15:33:57] [debug] [router] default match rule stdin.0:stdout.0
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
[2016/07/30 15:33:57] [debug] [in_serial] invalid JSON message, skipping
Segmentation fault
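The reset suggested above could look something like the sketch below. It is an illustration under assumptions, not the actual fix: `demo_state`, `demo_parse` and `demo_next_read_size` are hypothetical names, and the real field types in fluent-bit may differ. The point is that the length is reset both when the parser reports invalid JSON and when the buffer is full, so the size handed to the next read can never be computed from a `buf_len` that exceeds `buf_size`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping for the input buffer. */
struct demo_state {
    size_t buf_size;
    size_t buf_len;
};

/* Hypothetical parser outcomes; DEMO_INVALID plays the role of FLB_ERR_JSON_INVAL. */
enum demo_parse { DEMO_OK, DEMO_INCOMPLETE, DEMO_INVALID };

/*
 * Decide how many bytes the next read may store. Buffered data is discarded
 * when it was invalid or when no free space is left.
 */
static size_t demo_next_read_size(struct demo_state *st, enum demo_parse last)
{
    if (last == DEMO_INVALID || st->buf_len >= st->buf_size) {
        st->buf_len = 0;
    }
    assert(st->buf_len < st->buf_size);
    return st->buf_size - st->buf_len;
}
```

Without such a guard, an unsigned `buf_size - buf_len` would wrap around to a huge value once `buf_len` passes `buf_size`, and the following read could write far past the end of the buffer.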
from fluent-bit.
Thanks for catching that problem. I fixed the INVAL reset issue in ad18723.
Note: I merged all changes into the 0.8 branch (stdin_buffer should no longer be used).
from fluent-bit.
@edsiper Thanks. I will check it.
from fluent-bit.