Comments (10)
My notes:
- The function that counts objects does not return anything, effectively stopping streaming. I don't think it is important for that specific case, but it can be a source of errors.
- "end" works only if there is a "data" handler. It switches a stream into "flowing mode": https://nodejs.org/api/stream.html#stream_event_end
- "finish" is what you probably want: https://nodejs.org/api/stream.html#stream_event_finish
Below is the code I used to test it:
const fs = require("fs");
const { chain } = require("stream-chain");
const { parser } = require("stream-csv-as-json");
const { asObjects } = require("stream-csv-as-json/AsObjects");
const { streamValues } = require("stream-json/streamers/StreamValues");
let objectCount = 0;
const pipeline = chain([
  fs.createReadStream("./sample.csv"),
  parser(),
  asObjects(),
  streamValues(),
  (data) => {
    objectCount++;
    console.log(data);
    if (objectCount % 100 === 0) console.log(objectCount);
    return data; // I added this line to continue streaming
  },
]);
pipeline.on("error", (err) => {
  console.error("pipeline error", err);
});
// The next event handler enables 'end' as well.
// pipeline.on("data", (data) => {
//   console.log("data", data);
// });
pipeline.on("end", () => {
  console.warn("pipeline end", objectCount);
});
pipeline.on("finish", () => {
  console.warn("pipeline finish", objectCount);
});
My sample file (sample.csv):
a,b,c
1,2,3
4,5,6
package.json:
{
  "name": "temp",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "scripts": {
    "test": "echo \"Error: no test specified\" && exit 1"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "stream-chain": "^2.2.1",
    "stream-csv-as-json": "^1.0.2",
    "stream-json": "^1.5.0"
  }
}
from stream-csv-as-json.
I originally had return data, but it got stuck/paused at item 32.
I will investigate the use of the finish event. However, one thing I did not note was that the stream was a zip file entry from yauzl.
"finish" is not being called either in my case. Why does your code not show "pipeline end"?
I'm working on an example
Change sample.csv to be this:
a,b,c
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
4,5,6
1,2,3
6,6,6
It has over 32 lines of data. The last one is recognizable.
Output:
node index.js
{ key: 0, value: { a: '1', b: '2', c: '3' } }
{ key: 1, value: { a: '4', b: '5', c: '6' } }
{ key: 2, value: { a: '1', b: '2', c: '3' } }
{ key: 3, value: { a: '4', b: '5', c: '6' } }
{ key: 4, value: { a: '1', b: '2', c: '3' } }
{ key: 5, value: { a: '4', b: '5', c: '6' } }
{ key: 6, value: { a: '1', b: '2', c: '3' } }
{ key: 7, value: { a: '4', b: '5', c: '6' } }
{ key: 8, value: { a: '1', b: '2', c: '3' } }
{ key: 9, value: { a: '4', b: '5', c: '6' } }
{ key: 10, value: { a: '1', b: '2', c: '3' } }
{ key: 11, value: { a: '4', b: '5', c: '6' } }
{ key: 12, value: { a: '1', b: '2', c: '3' } }
{ key: 13, value: { a: '4', b: '5', c: '6' } }
{ key: 14, value: { a: '1', b: '2', c: '3' } }
{ key: 15, value: { a: '4', b: '5', c: '6' } }
{ key: 16, value: { a: '1', b: '2', c: '3' } }
{ key: 17, value: { a: '4', b: '5', c: '6' } }
{ key: 18, value: { a: '1', b: '2', c: '3' } }
{ key: 19, value: { a: '4', b: '5', c: '6' } }
{ key: 20, value: { a: '1', b: '2', c: '3' } }
{ key: 21, value: { a: '4', b: '5', c: '6' } }
{ key: 22, value: { a: '1', b: '2', c: '3' } }
{ key: 23, value: { a: '4', b: '5', c: '6' } }
{ key: 24, value: { a: '1', b: '2', c: '3' } }
{ key: 25, value: { a: '4', b: '5', c: '6' } }
{ key: 26, value: { a: '1', b: '2', c: '3' } }
{ key: 27, value: { a: '4', b: '5', c: '6' } }
{ key: 28, value: { a: '1', b: '2', c: '3' } }
{ key: 29, value: { a: '4', b: '5', c: '6' } }
{ key: 30, value: { a: '1', b: '2', c: '3' } }
{ key: 31, value: { a: '4', b: '5', c: '6' } }
pipeline finish 32
Removing the return data; line "fixes" it to return 44 rows ending in 6,6,6. Still no "end" event.
I don't see these behaviors with stream-json.
Take my original code above with your 44-row CSV file (or any other), uncomment the "data" event handler, and you will get all 44 rows.
I already gave this link before: https://nodejs.org/api/stream.html#stream_event_end — it explains why "end" is not issued. This is not something specific to stream-*; it is how streams were designed to behave (don't ask me why). The "flowing mode" is explained on the same page, e.g., https://nodejs.org/api/stream.html#stream_two_reading_modes
After reading it, please take my original code above (with "data" commented out) and add these three lines at the end:
console.log("readable flowing", pipeline.readableFlowing);
pipeline.resume();
console.log("readable flowing", pipeline.readableFlowing);
This is what I see:
readable flowing null
readable flowing true
{ key: 0, value: { a: '1', b: '2', c: '3' } }
...omitted for brevity...
{ key: 43, value: { a: '6', b: '6', c: '6' } }
pipeline finish 44
pipeline end 44
You can read all about readableFlowing and its possible values and what they mean using the link above. I hope it helps!
PS: The incomplete reads you saw before are the stream machinery filling its buffers in anticipation of future requests. Obviously it doesn't read everything.
Just to be clear, and easy for you to test:
This is the result:
> node index.js
pipeline finish 32
For reference:
index.js:
const fs = require("fs");
const { chain } = require("stream-chain");
const { parser } = require("stream-csv-as-json");
const { asObjects } = require("stream-csv-as-json/AsObjects");
const { streamValues } = require("stream-json/streamers/StreamValues");
let objectCount = 0;
const pipeline = chain([
  fs.createReadStream("./sample.csv"),
  parser(),
  asObjects(),
  streamValues(),
  (data) => {
    objectCount++;
    return data;
  },
]);
pipeline.on("error", (err) => {
  console.error("pipeline error", err);
});
pipeline.on("end", () => {
  console.warn("pipeline end", objectCount);
});
pipeline.on("finish", () => {
  console.warn("pipeline finish", objectCount);
});
So any of these works (just talking about "finish", not "end"):
- pipeline.on("data", function(){})
- removing return data in my callback
- pipeline.resume();
Anyhow, thank you for the references. I think my CSV code is slightly different from all my other code, as it does not use pipe, and the documentation you referred to above mentions changes to the reading mode. It also worries me that our code may have other bugs...
I am debugging someone else's code outside my area of expertise, but now I am getting there!
Thank you!
removing return data in my callback
It works only because the pipe is broken and produces no values: the stream machinery tries to fill its internal buffers, and none are filled. Some things work because they are documented to work that way, and some work because of obscure side effects; which to rely on is a judgement call.
I am glad that you were able to move forward with your project.