fast-csv's People

Contributors

abidex4yemi, adammichaelwilliams, adamvoss, bryant1410, caleb-irwin, dbbring, derjust, doug-martin, dustinsmith1024, foxmicha, jonstacks, manoellribeiro, matthewhembree, mdoelker, memg92, micheletriaca, neychok, olleolleolle, outbackstack, ovax3, rajit, renovate-bot, renovate[bot], rishabh-c2fo, rubenamaury, shane-walker, sumitbando, technotronicoz, xavi-, zackerydev

fast-csv's Issues

Doc: Example output for writeToString() with transform() is wrong

csv.writeToString(
    [
        {a: "a1", b: "b1"},
        {a: "a2", b: "b2"}
    ],
    {
        headers: true,
        transform: function (row) {
            return {
                A: row.a,
                B: row.b
            };
        }
    },
    function (err, data) {
        console.log(data); //"a,b\na1,b1\na2,b2\n"
    }
);

I believe the console line should be:

        console.log(data); //"A,B\na1,b1\na2,b2\n"

Maximum Call Stack Size Exceeded When Piping into another stream

It appears that when using fast-csv with a 'data' listener, the stream cannot also be piped into another stream; otherwise you get this error:

/Users/bistsls/projects/vlh/blah/node_modules/fast-csv/lib/parser/parser_stream.js:287
    emit: function (event) {
                   ^
RangeError: Maximum call stack size exceeded

Here is the code I am using:

var myCsvStream = fs.createReadStream('csv_files/myCSVFile.csv');
var csv = require('fast-csv');
var myData = [];

var myFuncs = { 
 parseCsvFile: function (filepath) {

  var csvStream;

  csvStream = csv
  .parse({headers: true, objectMode: true, trim: true})
  .on('data', function (data) {

       myData.push(data);

  })
  .on('end', function () {

    console.log('done parsing counties')
  });

  return csvStream;

 }
}

myCsvStream
.pipe(myFuncs.parseCsvFile())
.pipe(process.stdout);

The process.stdout is just so I can see that the data can continue on to the next stream; however, when adding .pipe(process.stdout), or even a through2 duplex stream, I get this maximum call stack error.

The issue is also noted in this SO post:
http://stackoverflow.com/questions/30620756/node-streams-get-maximum-call-stack-exceeded?noredirect=1
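
One workaround sketch: process.stdout expects bytes, not row objects, so serializing the parsed rows back into CSV text before piping them onward avoids handing objects to a byte stream. This assumes the csv.parse and csv.createWriteStream calls that appear in other issues here; it has not been verified against the version that raises the RangeError.

var fs = require('fs');
var csv = require('fast-csv');

fs.createReadStream('csv_files/myCSVFile.csv')
    .pipe(csv.parse({headers: true, objectMode: true, trim: true}))  // emits row objects
    .pipe(csv.createWriteStream({headers: true}))                    // turns the rows back into CSV text
    .pipe(process.stdout);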

Giving objects to writeToString with {headers: false} fails

csv.writeToString(
  [
    {a: "a1", b: "b1"},
    {a: "a2", b: "b2"}
  ], {
      headers: true
  }, function (err, data) {
    console.log('err:', err, 'data: |' + data + '|');
  }
);

gives

err: null data: |a,b
a1,b1
a2,b2|

but

csv.writeToString(
  [
    {a: "a1", b: "b1"},
    {a: "a2", b: "b2"}
  ], {
      headers: false
  }, function (err, data) {
    console.log('err:', err, 'data: |' + data + '|');
  }
);

gives

err: null data: |
|
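
A workaround sketch, assuming the same csv.writeToString API: with {headers: false}, pass each row as an array of values rather than an object, so the formatter does not need header names to look up the values.

csv.writeToString(
  [
    ["a1", "b1"],
    ["a2", "b2"]
  ], {
      headers: false
  }, function (err, data) {
    console.log('err:', err, 'data: |' + data + '|'); // expected: |a1,b1\na2,b2|
  }
);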

Intermittently fails for large csv file

Hi,

I have a CSV file with 15000 entries (1.3 MB). CSV parsing fails randomly on an AWS Amazon Linux box, but sometimes it passes cleanly, parsing all the rows correctly. Memory is not an issue; the machine instance has 4 GB. The error received is:

Parse Error: expected: '\"' got: 'undefined'

Sometimes the parser cannot read the complete line, so it fails to match the ending quote in the parseEscapedItem call at parser.js:76.
The strange thing is that the issue happens intermittently, so it does read correctly some of the time.

The only option I pass to the fromStream method is {ignoreEmpty: true}; the rest are all defaults.

Any pointers would be helpful.
thanks.

Are comments supported?

Hello,
I'm a new user of fast-csv, and I want to know: are comments supported? Just like the csv module:
comment: treat all the characters after this one as a comment; defaults to '#'.
I just haven't seen any similar option here. Thanks.
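
Assuming there is indeed no comment option (the issue found none), a workaround sketch is to filter comment lines out before handing the text to the parser; stripComments here is a hypothetical helper, not part of fast-csv.

var fs = require('fs');
var csv = require('fast-csv');

// hypothetical helper: drop lines whose first character is '#'
function stripComments(text) {
    return text
        .split('\n')
        .filter(function (line) { return line.charAt(0) !== '#'; })
        .join('\n');
}

fs.readFile('data.csv', 'utf8', function (err, text) {
    if (err) throw err;
    csv
        .fromString(stripComments(text), {headers: true})
        .on('data', function (row) { console.log(row); })
        .on('end', function () { console.log('done'); });
});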

Slow performance generating CSV from array / object

Hi,

I am trying your library to write CSV data files from data I have in memory and found the performance very slow. At first I thought it was a disk access issue, but then I coded a small benchmark separating CSV generation from disk writing and found that the main delay is in generating the CSV.

I coded an alternative benchmark that just nests two for loops, appending commas and newlines, and it was about 4 times faster.

Mostly I'm interested in knowing whether I'm doing something wrong in the way I'm using your lib, or if this is just its normal performance.

You can see my tests here: http://pastebin.com/McwhxY9B , http://pastebin.com/cRKZRqp3 . They just write 1 million rows with four 32-bit integers each. Test data generation isn't optimal, but it's enough for the purpose of this test, I think.

On my local computer I got about 8 seconds with the lib and 2 seconds with my alternative to generate the million rows. I tried several times and the results were consistent. The time ratio (~4 to 1) was also consistent when testing with different numbers of rows.

error event

Hi,
The documentation states that if fast-csv runs into a parse error, the "parse-error" event will be emitted.

Based on the source code (and my tests :) ) the "error" event will be emitted.

It would be nice if either the event could be changed or the documentation updated.

fromStream?

I'm using the fromStream example in the README, and I get this error...

Object # has no method 'fromStream'

`data` or `record` deprecated?

The parser stream code clearly emits record: https://github.com/C2FO/fast-csv/blob/master/lib/parser/parser_stream.js#L192-L197

And the tests clearly all use record, e.g. https://github.com/C2FO/fast-csv/blob/master/test/fast-csv.test.js#L461 (or seem to use some weird mix of data and record)

But the Readme notes that record is deprecated: https://github.com/C2FO/fast-csv/blame/master/README.md#L37

So I'm confused: is the emitRecord function in parser_stream incorrect, or is the Readme?

New release including headers fix?

Do you plan on making a new release soon?

0.5.4 on npm does not include the pull request I made that was merged on Dec 2. It would be great if I could depend on a fixed, released version to get a working fast-csv.

Request: add callback to each row item, so that it can wait before sending next row

From the example:

var csvStream = csv()
    .on("data", function(data){
         console.log(data);
    })
    .on("end", function(){
         console.log("done");
    });

stream.pipe(csvStream);

It would be nice to have this as an option:

.on('data', function(data, callback) {} );

This way you could process each row in turn and not start on the next row until the current one is done.

Example use:

.on( 'data', function(data, callback) {
    getWeatherForDay( data, function( err, temp ) {  
               writeDayAndTemp();
               callback(null); 
    });
});
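
In the meantime, one workaround sketch is to pause the csv stream while the per-row async work runs and resume it afterwards; this assumes pause()/resume() behave on the version in use (see the "Pause does not work" issue below).

var csvStream = csv()
    .on("data", function (data) {
        csvStream.pause();
        getWeatherForDay(data, function (err, temp) {   // async work for this row
            writeDayAndTemp(data, temp);
            csvStream.resume();                         // let the next row through
        });
    })
    .on("end", function () {
        console.log("done");
    });

stream.pipe(csvStream);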

Thanks for this library it is very useful!

Bug in writer

Trying the example:
csv
.fromPath("my.csv", {headers: true, objectMode: true})
.pipe(csv.createWriteStream({headers: true}))
.pipe(fs.createWriteStream("out.csv", {encoding: "utf8"}));

formatter.js line 88 (headers = hash.keys(item);) expects an object, but item is a string, so the script crashes:

TypeError
at Object.keys (/Users/tbentz/code/nodejs/extracts/extract_process/node_modules/fast-csv/node_modules/object-extended/index.js:116:23)
at Readable.writer.write (/Users/tbentz/code/nodejs/extracts/extract_process/node_modules/fast-csv/lib/formatter.js:88:36)

Pause does not work

Why is this important?
MongoDB is in many cases the go-to database for Node. However, if events are emitted faster than Mongoose can handle them, it can result in duplication and/or index errors, and possibly more.

Attempts at solutions and work arounds

1. Pausing the stream directly inside of the "record" event
stream = fs.createReadStream(filepath);
csvstream = csv.fromStream(stream).on("record", function(data){
stream.pause();
console.log("shouldn't see this more than once");
})

stream.pause() doesn't seem to do anything.

2. Going into old mode, then pausing directly

stream = fs.createReadStream(filepath);
stream.pause();
csvstream = csv.fromStream(stream).on("record", function(data){
stream.pause();
console.log("shouldn't see this more than once");
})
stream.resume();

stream.pause() doesn't seem to do anything.

3. Using the pause method of the Parser_Stream
https://github.com/C2FO/fast-csv/blob/master/lib/parser_stream.js#L156

stream = fs.createReadStream(filepath);
csvstream = csv.fromStream(stream).on("record", function(data){
csvstream.pause();
console.log("shouldn't see this more than once");
})

csvstream.pause() doesn't seem to do anything.

4. Combining both
https://github.com/C2FO/fast-csv/blob/master/lib/parser_stream.js#L156

stream = fs.createReadStream(filepath);
stream.pause();
csvstream = csv.fromStream(stream).on("record", function(data){
stream.pause();
csvstream.pause();
console.log("shouldn't see this more than once");
})
stream.resume();

Neither pause call seems to do anything.

5. Using csvstream to resume

stream = fs.createReadStream(filepath);
stream.pause();
csvstream = csv.fromStream(stream).on("record", function(data){
stream.pause();
csvstream.pause();
console.log("shouldn't see this more than once");
})
csvstream.resume();

The process doesn't start.
https://github.com/C2FO/fast-csv/blob/master/lib/parser_stream.js#L163

6. Attempting to see if csvstream.pause() even matters (which it doesn't)

stream = fs.createReadStream(filepath);
stream.pause();
csvstream = csv.fromStream(stream).on("record", function(data){
stream.pause();
console.log("shouldn't see this");
})
csvstream.pause();
stream.resume();

Doesn't work.

7. Using fromPath

csvstream = csv.fromPath(filepath).on("record", function(data){
csvstream.pause();
console.log("shouldn't see this more than once");
})

Doesn't work.

8. Checking if fromPath at least prevents it

csvstream = csv.fromPath(filepath).on("record", function(data){
console.log("shouldn't see this");
})
csvstream.pause();

Doesn't work
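
An alternative sketch that sidesteps pause()/resume() entirely: pipe the parsed rows into an object-mode Writable and only invoke the _write callback once the Mongo insert has finished, so pipe backpressure throttles the parser. This assumes a streams2-capable Node and that the parser stream can be piped in object mode; MyModel is a hypothetical Mongoose model.

var fs = require('fs');
var csv = require('fast-csv');
var Writable = require('stream').Writable;

var inserter = new Writable({objectMode: true});
inserter._write = function (record, encoding, done) {
    // calling done() releases the next record, so inserts happen one at a time
    MyModel.create(record, done);
};

fs.createReadStream(filepath)
    .pipe(csv.parse({headers: true}))
    .pipe(inserter)
    .on('finish', function () {
        console.log('all records inserted');
    });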

Double Quoted Values invalid?

Excel's way to escape double quotes is like so.

"He said ""Hello World"""

Unfortunately, it looks like the regex checking for valid rows does not account for this. I'd take a stab at the regex for this, but I'd definitely screw it up. Thoughts on this? Agree that this is a valid row?

Missing rowDelimiter at end of last line

The last line doesn't get a finishing "\n", which messes up legacy code.

What's the best way to add that? (using a stream with transformation).

I didn't seem to be able to find the right place for a write or a push.
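
One sketch is the includeEndRowDelimiter option that comes up in the next issue, which appends the row delimiter after the final row (assuming it is available in the version in use):

var csvStream = csv.createWriteStream({headers: true, includeEndRowDelimiter: true});
csvStream.pipe(fs.createWriteStream("out.csv", {encoding: "utf8"}));
csvStream.write({a: "a1", b: "b1"});
csvStream.end();   // the written file should end with the row delimiter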

includeEndRowDelimiter ignored on windows

Hi, I'm using the latest version (0.5.1) on a node-webkit project.

The same code is used on Windows and Linux.

On Linux, when I use "includeEndRowDelimiter": true, I can see the rowDelimiter character at the end of the generated CSV; on Windows the character is missing.

It seems that the _flush method in formatter_stream.js is not called.

on('end') not being triggered

I am reading a csv file and streaming the data through a mongoose.js model.

myFile.js

var stream = fs.createReadStream("name of file");

fcsv(stream)
    .on('data', function (data) {
        ModelName.find(query, function (err, docs) {
            console.log('docs', docs);
        });
    })
    .on('end', function () {
        console.log('done');
    })
    .parse();

The script runs and a list of docs is printed out.
But the on('end') is not triggered.

What can I do to call the on('end')?

Error when parsing csv file with double quotes inside (comma as delimiter)

How to handle the exception with the invalid csv file like double quotes in the content?

I want to keep the double quotes in the end, or catching the exception and ignoring the invalid row would be acceptable as well. In the following example, I want the first column to be:

blah...Start by "stamping" powder on the outer edges of face and ...blah

For example, the "stamping" in the csv file:

"blah...Start by "stamping" powder on the outer edges of face and ...blah","blah","blah blah","blah blah blah"

Error:

error: Parse Error: expected: '"' got: 's'. at 'stamping"  Error: Parse Error: expected: '"' got: 's'. at 'stamping"
    at parseEscapedItem (/node_modules/fast-csv/lib/parser.js:73:19)
    at ParserStream.parseLine [as parser] (/node_modules/fast-csv/lib/parser.js:142:30)
    at ParserStream._parseLine [as _parse] (/node_modules/fast-csv/lib/parser_stream.js:109:25)
    at ParserStream.extended.extend._transform (/node_modules/fast-csv/lib/parser_stream.js:166:29)
    at ParserStream.Transform._read (_stream_transform.js:179:10)
    at ParserStream.Transform._write (_stream_transform.js:167:12)
    at doWrite (_stream_writable.js:221:10)
    at writeOrBuffer (_stream_writable.js:211:5)
    at ParserStream.Writable.write (_stream_writable.js:180:11)
    at write (_stream_readable.js:583:24) 

The error message below follows the one above:

verbose: Shutting down Aura...
verbose: Closing Http Server
error: Error: undefined
    at Object.serverErrorOccurred [as 500] (/config/500.js:19:28)
    at ServerResponse.respond500 [as serverError] (/node_modules/aura/lib/aura/http/hooks/request.js:127:23)
    at Domain. (/node_modules/aura/lib/aura/express/load.js:56:17)
    at Domain.EventEmitter.emit (events.js:95:17)
    at ParserStream.EventEmitter.emit (events.js:70:21)
    at spreadArgs (/node_modules/fast-csv/lib/parser_stream.js:20:21)
    at ParserStream.extended.extend.emit (/node_modules/fast-csv/lib/parser_stream.js:217:13)
    at ParserStream.onerror (_stream_readable.js:518:12)
    at ParserStream.EventEmitter.emit (events.js:95:17)
    at spreadArgs (/node_modules/fast-csv/lib/parser_stream.js:20:21)

pipe to a transform stream

I'm a little new to node streams, but I don't see how I can pipe the csvStream to another transform stream, since the data output is the stringified version of the record (not the parsed version).
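
For reference, a sketch of one way to get parsed objects into a custom transform: put an object-mode Transform directly in the pipeline. This assumes the objectMode option shown in other issues here makes the piped output objects rather than stringified records.

var fs = require('fs');
var csv = require('fast-csv');
var Transform = require('stream').Transform;

// example transform: upper-case every field of each row
var upcase = new Transform({objectMode: true});
upcase._transform = function (row, encoding, done) {
    var out = {};
    Object.keys(row).forEach(function (key) {
        out[key] = String(row[key]).toUpperCase();
    });
    done(null, out);
};

fs.createReadStream('in.csv')
    .pipe(csv.parse({headers: true, objectMode: true}))
    .pipe(upcase)
    .on('data', function (row) {
        console.log(row);
    });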

Stack errors with very large CSV files

I'm getting a RangeError when I read very large - 570k - CSV files.

Reading from myfile.csv

/..../node_modules/fast-csv/lib/parser/parser_stream.js:277
emit: function (event) {
^
RangeError: Maximum call stack size exceeded

I do not have the same problem when I write the file of this size - only when I turn around and read it back.

Here's the pattern I'm using :

var file_name = process.argv[2];
var instream = fs.createReadStream(file_name);
var readStream = csv.parse({headers:true});

readStream.on('data', function(data) {
// transform the data and write it to a new file
});

readStream.on('end', function() {
console.log('\ndone.');
});

instream.pipe(readStream);

end event not triggered

I was running v0.5.4 on node 0.10.32 without issues, but then decided to update to 0.5.6. Now 'record' and 'data-invalid' are triggered as expected, but when the file is done, no 'end' event occurs. I tried this with CSV files from different sources, so I'm sure there's nothing wrong with the input.

I also updated node to 0.12.0, but the problem persists. Rolling back to fast-csv v0.5.5 resolved it, though.

My use case:

var csv = require('fast-csv');
var Promise = require('bluebird');

var promisecsv = Promise.method(function(path, options) {
    return new Promise(function(resolve, reject) {
        var records = [];

        csv
            .fromPath(
                path,
                options
            )
            .validate(function(data) {
                // do stuff here, return true / false
            })
            .on('record', function(record) {
                console.log(records.length);
                records.push(record);
            })
            .on('data-invalid', function() {
                // validation failed, do nothing
            })
            .on('end', function() {
                console.log('parsing done: ' + records.length);
                resolve(records);
            })
        ;
    });
});

module.exports = promisecsv;

Piping output through zlib

I am attempting to output a compressed csv like this:

var csvStream = csv.createWriteStream({headers:true});
var writableStream = fs.createWriteStream("foo.csv.gz");
csvStream.pipe(zlib.createGzip()).pipe(writableStream);
csvStream.write({a:1,b:2});
csvStream.end();

I am finding that the file comes out zero length. If I remove the zlib stream in the middle, things work perfectly. Any ideas?

Thanks!

Chris

Typo in README

It seems that there is a typo in the documentation: you are calling the method fs.createWritableStream (which doesn't exist on the default fs module) instead of fs.createWriteStream.

download csv

We are using fast-csv to create a CSV report and we want to download this file once it's created. Any idea how we can download this with fast-csv?
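
One common pattern, sketched here for an Express-style app (app, req, res, and rows are assumptions for illustration, not part of fast-csv), is to stream the CSV straight to the HTTP response with a Content-Disposition header instead of writing a file first:

app.get('/report.csv', function (req, res) {
    res.setHeader('Content-Type', 'text/csv');
    res.setHeader('Content-Disposition', 'attachment; filename="report.csv"');

    var csvStream = csv.createWriteStream({headers: true});
    csvStream.pipe(res);                                      // the response is just a writable stream
    rows.forEach(function (row) { csvStream.write(row); });   // rows: the in-memory report data
    csvStream.end();
});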

Error: write after end

https://gist.github.com/devTristan/5aa69c2822a0da3f5e51

If the stream fast-csv is piped to doesn't always write immediately, I get a write after end error. This happens if it waits 100ms every 100 records, or every 10ms on every other record, or pretty much any other combination.

I've tried this with through and through2. I'm not familiar enough with streams to replicate this issue without using a helper module, so it's possible that this bug is present in both through and through2 instead of fast-csv. However, I'm pretty sure this is a fast-csv thing, because I can't replicate it using other csv readers.

Commenting out this line prevents this issue from happening, but it also causes a test to fail (specifically, the github issue test for #68).

Help me, @doug-martin, you're my only hope.

csv skips rows when writing

This code, which should copy a file, will not work for moderately large files (~50 MB):

var csv = require('fast-csv');

csv
.fromStream(fs.createReadStream("./input.csv"), {
    headers: true,
    ignoreEmpty: true
})
.transform(function(row) {
    return row;
})
.on("end", function() {
    console.log("done")
})
.pipe(csv.createWriteStream({
}))
.pipe(fs.createWriteStream("./output.csv", {
    encoding: "utf8"
}));

The result is that many rows are missing from the output file. My guess is you're not waiting for the 'drain' event before writing to the file.

Semicolon support

It would be a great feature to have semicolon support when reading csv files. In many countries the csv file format is delimited by semicolons and not commas.
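
For what it's worth, the delimiter option used in other issues here should already cover this; a sketch:

csv
    .fromPath('data.csv', {headers: true, delimiter: ';'})
    .on('data', function (row) { console.log(row); })
    .on('end', function () { console.log('done'); });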

How to transform input csv to output csv via pipes

I'm trying to add an extra column to a csv file and would like to do this via streams utilising a pipe. This is what I have so far:

var fs = require('fs');
var csv = require("fast-csv");

var input = fs.createReadStream("input.csv");
var output = fs.createWriteStream("output.csv", {encoding: 'utf-8'});

var count = 1;

var csvStream = csv({rtrim: true, headers: true})
    .on("record", function(data){
        data.ordinalValue = count;
        count = count + 1;
     })
     .on("end", function(){
        console.log("done");
     });

input.pipe(csvStream).pipe(output);

It just outputs stringified JSON objects, though.
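
A sketch of one way to keep everything in the pipe: mutate each row in transform() and then pipe through a formatting stream, so the output file is CSV text rather than stringified objects (assuming the transform and createWriteStream APIs shown in other issues here):

var count = 1;

csv
    .fromStream(input, {trim: true, headers: true})
    .transform(function (row) {
        row.ordinalValue = count;
        count = count + 1;
        return row;            // the returned row is what gets formatted
    })
    .pipe(csv.createWriteStream({headers: true}))
    .pipe(output);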

Writes [object Object] to csv file?

According to the docs I can write arrays of objects. However, the resulting .csv file has '[object Object]' field after field, instead of a header of the objects' fields followed by rows of the objects' values.

function makeReport(stamp, cb){
    var outputDir = getDir(stamp);
    var file = path.join(outputDir, 'shortcode_report.csv');
    var report = [];
    for(var k in shortcodeStats)
        report.push(shortcodeStats[k]);
    writeReport(file, report, cb);
}

function writeReport(file, data, cb){
    var existsAlready = fs.existsSync(file);
    var csvStream = csv.createWriteStream({
        headers: !existsAlready,
        includeEndRowDelimiter: true,
    });
    csvStream.on('finish', function(err){
        cb(err, file);
    });
    csvStream.pipe(fs.createWriteStream(file, {flags:'a'}));
    csvStream.write(data);
    csvStream.end();
}

stats object looks like this:

stats = {
    'shortcode': 'beftfit_325454',
    'boolean req failed occurance': 0,
    'eligibly failed occurance': 0,
    'boolean hois': '',
    'eligiblity hois': ''
};

output was:

[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object],[object Object]
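
The likely trigger is that csvStream.write(data) is being handed the whole report array as if it were one row. A sketch of the probable fix, writing the rows one at a time (same createWriteStream API as above):

function writeReport(file, data, cb){
    var existsAlready = fs.existsSync(file);
    var csvStream = csv.createWriteStream({
        headers: !existsAlready,
        includeEndRowDelimiter: true,
    });
    csvStream.on('finish', function(){
        cb(null, file);
    });
    csvStream.pipe(fs.createWriteStream(file, {flags:'a'}));
    data.forEach(function(row){
        csvStream.write(row);   // one object per row
    });
    csvStream.end();
}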

Behavior change between 0.4.4 and 0.5.0

I recently updated an app using fast-csv from 0.4.4 to 0.5.0

It also uses async 0.9.0.

I have this code:

var async = require('async');
var csv = require('fast-csv');
var fs = require('fs');

var rows = [];

for(var i = 0; i< 100000; i++)
{
    rows.push({a:i, b:i+2, c:i+4});
}

var csvStream;
var fileCount = 0;
async.doWhilst(
    function(done) {
        var filename = "test-" + fileCount + ".csv";
        console.info("Writing to: ", filename);
        csvStream = csv.createWriteStream({headers:true});
        var writableStream = fs.createWriteStream(filename);
        csvStream.pipe(writableStream);

        async.doWhilst(
            function(done) {
                var row = rows.shift();
                console.info("Row: ", row);
                csvStream.write(row);
                setImmediate(done);
            },
            function() {
                console.info("Bytes: ", writableStream.bytesWritten);
                return writableStream.bytesWritten < 1024 * 512 && rows.length > 0;
            },
            function(err) {
                console.info("Done with: " + filename);
                csvStream.write(null);
                fileCount++;
            }
        );

        writableStream.on('finish', function() {
            console.info('finished');
            setImmediate(done);
        });
    },
    function() {
        return rows.length > 0;
    },
    function(err) {
        console.info("Done writing rows");
    }
);

I have found that with 0.4.4 this does as expected and creates two files test-0.csv and test-1.csv.

With 0.5.0 when the inner doWhilst() finishes the outer never restarts and execution stops. Very bizarre behavior. There must be some change in fast-csv causing this but I can't figure out what.

fast-csv error stream issues

I ran into an issue when using fast-csv and listening for the error event. It seems a fast-csv stream doesn't properly manage stream events. The example I have: I throw an exception outside of the stream, and the fast-csv stream's error event triggers instead of the exception being handled by the process.

Example can be found at https://gist.github.com/jgornick/cd2b8bf0d8ba76b5d747

How to skip first couple of lines?

We are parsing files from various sources, and for some of these files we need to skip the first couple of lines. Can you provide some suggestions on how to achieve this? I didn't see any way to do this in the module API.

Thanks!
Tom
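
A workaround sketch: count rows in the data handler and ignore the first N until a skip option exists. The numbers and the handleRow function here are just illustrative.

var skip = 2;   // how many leading rows to drop
var seen = 0;

csv
    .fromPath('file.csv')
    .on('data', function (row) {
        seen++;
        if (seen <= skip) { return; }   // skip the first couple of lines
        handleRow(row);                 // hypothetical per-row handler
    })
    .on('end', function () {
        console.log('done');
    });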

Empty last line breaks parser

Hi,

It seems that CSV files ending with an empty line fail to parse:

events.js:72
        throw er; // Unhandled 'error' event
              ^
Error: Parse Error: expected: '"' got: 'undefined'. at '
    at parseEscapedItem ([...]\node_modules\fast-csv\lib\parser.js:70:23)
    at ParserStream.parseLine [as parser] ([...]\node_modules\fast-csv\lib\parser.js:142:30)
    at ParserStream._parseLine [as _parse] ([...]\node_modules\fast-csv\lib\parser_stream.js:109:25)
    at ParserStream.extended.extend._flush ([...]\node_modules\fast-csv\lib\parser_stream.js:184:18)
    at ParserStream.<anonymous> (_stream_transform.js:130:12)
    at ParserStream.g (events.js:175:14)
    at ParserStream.EventEmitter.emit (events.js:117:20)
    at spreadArgs ([...]\node_modules\fast-csv\lib\parser_stream.js:17:21)
    at ParserStream.extended.extend.emit ([...]\node_modules\fast-csv\lib\parser_stream.js:217:13)
    at finishMaybe (_stream_writable.js:354:12)

I removed the line break manually and voila - parsing worked fine. I understand that this is rather an issue of unclean CSV data, but fixing this would make the parser more robust.

throwing err with large tsv

I have a 2M-line TSV, and I'm getting this:

events.js:72
    throw er; // Unhandled 'error' event
          ^
Error: no writecb in Transform class
    at afterTransform (_stream_transform.js:90:33)
    at TransformState.afterTransform (_stream_transform.js:74:12)
    at /Users/nickdestefano/node_modules/fast-csv/lib/parser/parser_stream.js:206:21
    at /Users/nickdestefano/node_modules/fast-csv/lib/parser/parser_stream.js:121:17
    at asyncIterator (/Users/nickdestefano/node_modules/fast-csv/lib/extended.js:29:17)
    at Object._onImmediate (/Users/nickdestefano/node_modules/fast-csv/lib/extended.js:18:37)
    at processImmediate [as _immediateCallback] (timers.js:345:15)

My code is using the writeStream. I am trying to use a readStream in here as well but I'm wondering if this should be good enough, or is there a cap on size?

var formatStream = csv
  .createWriteStream({headers: true})
  .transform(function(data){
      return {
          'Author': data['Author'],
          'GUID': data['GUID'],
          'Contents': data['GUID'].split(',').join(''),
          'Date(GMT)': moment(new Date(data['Date(GMT)'])).format('MM/DD/YYYY HH:mm:ss A')
      };
  });
csv
    .fromPath(file, {delimiter: '\t', quote: '�', objectMode: true, headers: true, escape: '′'})
    .pipe(formatStream)
    .pipe(fs.createWriteStream(test, {encoding: "utf8"}));

Also, I tried a few different CSV parsing libraries and this is the best I've seen, so nice work.

Formatting Functions example does not work

Using [email protected]
The first example for "Formatting Functions" does not work.

The example:
var csvStream = csv.createWriteStream({headers: true}),
writableStream = fs.createWriteStream("my.csv");

writableStream.on("finish", function(){
console.log("DONE!");
});

csvStream.pipe(writableStream);
csvStream.write({a: "a0", b: "b0"});
csvStream.write({a: "a1", b: "b1"});
csvStream.write({a: "a2", b: "b2"});
csvStream.write({a: "a3", b: "b4"});
csvStream.write({a: "a3", b: "b4"});
csvStream.end();

Results in:
events.js:72
throw er; // Unhandled 'error' event
^
TypeError: Object # has no method 'end'

What am I doing wrong?

Thanks

npm install / update is bloated

Through the dependency chain, a very old version of grunt-contrib-jshint is being loaded that references a bloated tarball of esprima.

The chain occurs here:

fast-csv -> string-extended -> array-extended -> grunt-contrib-jshint (~0.4.3) -> jshint -> esprima

and also here:

fast-csv -> string-extended -> number-extended -> grunt-contrib-jshint (~0.4.3) -> jshint -> esprima

There are a number of dependent projects from Doug Martin that are installing grunt and grunt-related modules even on npm install --production.
I believe updates to these modules would fix the bloated dependency tree.

async doesn't seem to work

I'm having trouble with the async next function

If I have this code:
.validate(function (data, next) {
console.log('data=======', next);

The result of next is 0: data======= 0

and crashes the app:
next(valid);
^
TypeError: number is not a function

I'm using version: '0.5.2'

What am I doing wrong?

Thanks

Parser doesn't work after 2.2 million entries

I am trying to use fast-csv, but it stops working after around 2.3 million entries when reading a file.

var csvStream = csv
    .fromStream(stream, {delimiter : ';', ignoreEmpty: true})
    .validate(function(data, next) {
        //console.log('validating');
        if (videosToBeInserted.length >= 500) {
            Entity.collection.insert(videosToBeInserted, {}, function(error, docs) {
                if (error) {
                    console.log(error);
                }
                else {
                    console.info('%d entities were successfully stored.', docs.length);
                    success += docs.length;
                }
                videosToBeInserted = [];
                next(null, true);
            });
        }
        else {
            next(null, true);

        }

    })
    .on("data", function(data) {

        if (counter >= counterOffset) {
            var entity = createEntity(data);
            console.log('Entity: ' + counter + ' - ' + entity.externalId);
            videosToBeInserted.push(entity);
        }
        counter++;
    })
    .on("end", function(){
        console.log("done");
        console.log('Total entries: ' + counter);
        console.log('Success: ' + success);
        console.log('Errors: ' + (counter - success));
        console.log('Parsing errors: ' + parsingErrors);
    })
    .on('error', function(error) {
        console.log("Catch an invalid csv file!!!");
        console.log(error);
        parsingErrors++;

    });

stream.pipe(csvStream);

I tried multiple times to run it but it just freezes.

Streams 2 support

Really like this csv library compared to others I've found, but streaming is a little clunky. Would prefer to do this:

in.pipe(csv({headers: true})).pipe(process.stdout)

rather than this:

csv(in, {headers: true}).on('data', function(d){console.log(d)}).parse();

TypeError at Object.keys

This snippet does not work in webkit (desktop app):

var csvStream = csv
    .createWriteStream({headers: true})
    .transform(function(row){
        return {
           A: row.a,
           B: row.b
        };
    }),
    writableStream = fs.createWriteStream("my.csv");

writableStream.on("finish", function(){
  console.log("DONE!");
});

csvStream.pipe(writableStream);
csvStream.write({a: "a0", b: "b0"});
csvStream.write({a: "a1", b: "b1"});
csvStream.write({a: "a2", b: "b2"});
csvStream.write({a: "a3", b: "b4"});
csvStream.write({a: "a3", b: "b4"});
csvStream.write(null);

My stacktrace:

TypeError
    at Object.keys (/home/ivan/Workspace/js/neuroskyvis/app/node_modules/fast-csv/node_modules/object-extended/index.js:116:23)
    at CsvTransformStream.extended.extend.write (/home/ivan/Workspace/js/neuroskyvis/app/node_modules/fast-csv/lib/formatter.js:126:51)
    at HTMLInputElement.<anonymous> (file:///home/ivan/Workspace/js/neuroskyvis/app/dist/app.min.js:121:19)
    at HTMLInputElement.jQuery.event.dispatch (file:///home/ivan/Workspace/js/neuroskyvis/app/dist/vendor.min.js:4409:9)
    at HTMLInputElement.elemData.handle (file:///home/ivan/Workspace/js/neuroskyvis/app/dist/vendor.min.js:4095:28)

I'm falling back to csv module for now.

formatter.js: Disabling quote doesn't work

When I try to disable the quoting like this:

var csv = require('fast-csv');

var csvStream = csv.createWriteStream({
  quote : null
});

it's not working, because the null value of options.quote is ignored while setting QUOTE in formatter.js:

QUOTE = options.quote || '"',

When I use the following option

var csvStream = csv.createWriteStream({
  quoteColumns : false
});

the fields are escaped / quoted, too.

Example

What I want: my text;text with " in it;another text
What I get: my text;"text with "" in it";another text
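
The quoted line explains the behavior: a null (or any falsy) quote value is swallowed by the || default. A sketch of the shape of a possible fix in formatter.js, falling back to '"' only when the option is absent, so an explicit null can disable quoting:

// only default when the option was not provided at all
QUOTE = options.hasOwnProperty("quote") ? options.quote : '"',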

.transform() async callback?

Is it possible to do an async operation in the transform method?

.transform(function(data, done){
   mongoose.model('user').findOne({ username: data.username }, function(err, result){
     if(err) throw err;
     return done({ userId: result._id });
  });
});
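
For reference, a sketch of how this could look if the transform callback accepted a done callback in the Node (err, transformedRow) style; this is an assumption about the API shape, not something the docs promise at this point:

.transform(function (data, done) {
    mongoose.model('user').findOne({ username: data.username }, function (err, result) {
        if (err) { return done(err); }
        done(null, { userId: result._id });   // pass the transformed row to the formatter
    });
})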

No parsing occurs in certain contexts

In certain contexts, like Meteor publish functions, or io.sockets.on('connection') functions, it appears that .parse() doesn't run. By contrast, the csv module does parse data in the very same context. Here is sample code that reproduces the bug:

'use strict';
var fs = require('fs');
var stream = fs.createReadStream("bug.csv");
var csv = require("fast-csv");

var io = require('socket.io').listen(8901);

io.sockets.on('connection', function (socket) {

  console.log("On connection");
  csv(stream, {headers : true})
    .on("data", function (d) {
      console.log(d);  // never here
    })
    .on("end", function () {
      console.log("done");  // never here
    })
    .parse();
  console.log("After csv.parse(), which didn't happen");

});

The client code just connects to the server:

<!doctype html>
<html>

<head>
<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/0.9.10/socket.io.min.js"></script>

<body>

<script>
var websocket = io.connect('ws://localhost:8901');
</script>
