
node-csv's Introduction

     _   _           _        _____  _______      __
    | \ | |         | |      / ____|/ ____\ \    / /
    |  \| | ___   __| | ___ | |    | (___  \ \  / /
    | . ` |/ _ \ / _` |/ _ \| |     \___ \  \ \/ /
    | |\  | (_) | (_| |  __/| |____ ____) |  \  /
    |_| \_|\___/ \__,_|\___| \_____|_____/    \/     MIT License

CSV packages for Node.js and the web

This project provides CSV generation, parsing, transformation and serialization for Node.js.

It has been tested and used by a large community over the years and should be considered reliable. It provides every option you would expect from an advanced CSV parser and stringifier.

Project structure

This repository is a monorepo managed using Lerna. There are 5 packages managed in this codebase, published to npm separately:

  • csv: the umbrella package, bundling the four below
  • csv-generate: generates CSV data, for instance for test purposes
  • csv-parse: parses CSV text into arrays or objects
  • csv-stringify: serializes records into CSV text
  • stream-transform: a transformation framework for records
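For illustration, a minimal sketch of pulling the pieces in under the current package layout (assuming all five packages are installed from npm):

const { generate } = require('csv-generate');
const { parse } = require('csv-parse');
const { transform } = require('stream-transform');
const { stringify } = require('csv-stringify');

// Or through the umbrella package, which re-exports them all:
const csv = require('csv'); // csv.generate, csv.parse, csv.transform, csv.stringify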

Documentation

The full documentation for the current version is available on the official CSV project website.

Features

  • Extends the native Node.js transform stream API
  • Simplicity with the optional callback and sync API
  • Support for ECMAScript modules and CommonJS
  • Large documentation, numerous examples and full unit test coverage
  • Few dependencies, in many cases zero dependencies
  • Node.js support from version 8 to latest
  • Mature project with more than 10 years of history

License

Licensed under the MIT License.

Contributors

The project is sponsored by Adaltas, a Big Data consulting firm based in Paris, France.

node-csv's People

Contributors

aghost-7, ajaz-ur-rehman, ajitkaller, bokub, cmbuckley, dandv, dhull, dougwilson, drdmitry, drl-max, eladb, ffflorian, hakatashi, igor-savin-ht, jmarca, jonseymour, jpschorr, khorwood, kibertoad, markstos, nuxij, odykyi, olalonde, rauno56, varunoberoi, wdavidw, willfarrell, winton, x3cion, yonilerner


node-csv's Issues

Line breaks inside of quote delimited fields don't work

fix:

            case '\n':
                if(state.quoted) {
                    state.field += c; // keep line breaks inside quoted fields
                    break;
                }
                if( !csv.readOptions.quoted && state.lastC === '\r' ){
                    break; // swallow the \n of a \r\n pair
                }
                // fall through to the \r handling
            case '\r':
                if(state.quoted) {
                    state.field += c; // keep line breaks inside quoted fields
                    break;
                }
                if( csv.writeOptions.lineBreaks === null ){
                    // remember the input's line-break style for the output
                    csv.writeOptions.lineBreaks = c + ( c === '\r' && chars.charAt(i+1) === '\n' ? '\n' : '' );
                }

Is this totally async?

I have a CSV string in the form of a variable, and all I want is to parse it and export that variable. I don't understand this .to business.
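For reference, a minimal sketch under the current csv-parse API (not the legacy .to chain), which parses a string and hands the records to a callback:

const { parse } = require('csv-parse');

const input = 'a,b\n1,2'; // hypothetical CSV string held in a variable
parse(input, { columns: true }, (err, records) => {
  if (err) throw err;
  console.log(records); // [{ a: '1', b: '2' }]
});

Recent versions also ship a synchronous variant, require('csv-parse/sync'), when no streaming or callbacks are wanted at all.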

No newline present on last line of to.path()

I'm writing to a file using to.path() and there's no newline on the last row inserted. This means that when I append to the file later on (using to.path() again), the first row of the second write lands on the same line as the last row of the first write.

Is there any way to force CSV to write a newline at the end of a write? I've tried using on("end", ...) but can't get it to work.

Thanks.

parsing more than one file

Hi,
I tried to parse two CSV files simultaneously, and the data records from one file got mingled with the callback from the other CSV reading invocation.
Generally, I'd like to use your module like this:

var CSV = require('csv');
var csv1 = new CSV();
var csv2 = new CSV();

csv1.fromPath('1.csv').on('data', cb1);
csv2.fromPath('2.csv').on('data', cb2);

improperly escaped double quotes throw errors

This is technically correct behavior, although in practice many CSV parsers are able to parse unescaped double quotes correctly:

"98174","Permitted Site; undocumented","Lophostemon confertus :: Brisbane Box","2637 24th St","1","Sidewalk: Curb side : Cutout","Tree","Private","","","","3x3","","",""
"98175","Permitted Site; undocumented","Rhaphiolepis "Majestic Beauty" :: Indian Hawthorn  'Majestic Beau'","2637 24th St","2","Sidewalk: Curb side : 

the "Majestic Beauty" on the second line should instead be ""Majestic Beauty"" but since the double quotes aren't escaped I got an error Invalid closing quote; found "M" instead of delimiter ",". it would be nice if this module could try to auto escape double quotes!

package.json contains an extraneous ,

package.json contains an extraneous comma (,) between the contributors and directories elements of the package object. This is causing npm link/install to fail.

NPM

Please add npm install csv to the readme! Thanks for this project!

Feature Request - add async to the parser

Hi, I'm using this great library to parse CSV files and insert data into a MySQL database. It would be great if this lib had the possibility to tell it when to continue with the parsing process. For example, in the transform event, would it be possible to add a new 'callback' parameter? This would tell the CSV parser when to continue sending data to us, because if I insert a new MySQL row per 'data' event, the 'end' event fires before MySQL has finished processing everything.

Thanks in advance,
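For reference, the current stream-transform package supports exactly this: declaring a second callback argument in the handler makes it asynchronous, and the pipeline waits for the callback before sending more records. A sketch, with insertIntoMySQL standing in for whatever hypothetical database call you make:

const { parse } = require('csv-parse');
const { transform } = require('stream-transform');
const fs = require('fs');

fs.createReadStream('data.csv') // hypothetical input file
  .pipe(parse({ columns: true }))
  .pipe(transform((record, callback) => {
    // the pipeline only continues once callback() is invoked
    insertIntoMySQL(record, (err) => callback(err, record)); // hypothetical helper
  }, { parallel: 1 })); // process one record at a time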

Example for converting an array to CSV string

I am having some trouble figuring out how to convert an array of objects to a CSV string. All the examples and documentation involve file output, while I want just a CSV string as output. How do I do that?

Thanks.
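For reference, a minimal sketch with the current csv-stringify API, which accepts records and returns the CSV text through a callback instead of writing to a file:

const { stringify } = require('csv-stringify');

const records = [
  { name: 'Ada', city: 'London' },
  { name: 'Grace', city: 'Arlington' },
];

stringify(records, { header: true }, (err, output) => {
  if (err) throw err;
  console.log(output); // "name,city\nAda,London\nGrace,Arlington\n"
});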

process.stdin not handled properly

I copied sample.js to sample-stdin.js then modified sample-stdin.js to use process.stdin.

csv()
.fromStream(process.stdin)

Then ran:

node sample-stdin.js < sample.in

This fails to produce any output.
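For reference, under the current API the equivalent wiring is a plain pipe chain, which does handle process.stdin; a sketch:

const { parse } = require('csv-parse');
const { stringify } = require('csv-stringify');

// run as: node sample-stdin.js < sample.in
process.stdin
  .pipe(parse())
  .pipe(stringify())
  .pipe(process.stdout);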

Add stop event

Would it be possible to add a stop event to a stream? I'm parsing large files and at some points in time, I need to stop the stream and start another one.

Migration from 0.0.13 not sending EOF?

This code used to work fine even when the input was big:

  csv().from(object)
    .toStream(result, {end: true});

Now this one doesn't seem to send the "EOF":

  csv().from(object)
    .to.stream(result, {end: true});

Convert zero values to nulls

When using from.array of numeric values that are potentially zeros, these values are converted to nulls by default.

Tracking it down to transformer.js, around line 120:

  lineAsObject = {};
  for (i = _j = 0, _len1 = columns.length; _j < _len1; i = ++_j) {
    column = columns[i];
    lineAsObject[column] = line[column] || null;
  }

In case of 0 (zero) the end result loses the difference between 0, null or undefined.

A correction could be:

    lineAsObject[column] = line[column];

Or something like it.

Otherwise, the output csv has missing zero numbers.

Am I the first to report this?

Thanks,

Ronnen

Injecting records into the output stream?

Have you considered a mechanism for injecting new records into the output stream?

For example, it would be useful for my purposes to be able to write a summary record into the output stream or to be able to duplicate a record in the output stream but I can't see a clean way to do this with the existing API.

Output new columns

Input:

column1,column2
hello,world
yoyo,dodo

Code:

csv()
  .fromPath(..., { columns: true })
  .transform(function(obj) {
    obj.column3 = 'the-awesome-column-' + obj.column1;
    return obj;
  })
  .to(process.stdout, { end: true });

I expected this to yield an output that contains all the columns (because columns is true on the read):

column1,column2,column3
hello,world,the-awesome-column-hello
yoyo,dodo,the-awesome-column-yoyo

But it yields only the first two columns:

column1,column2
hello,world
yoyo,dodo

I would suggest that if { columns: true } is provided on the reader and the writer doesn't have any columns specification (or also specifies { columns: true }), all the columns from all the objects should be emitted.

The way I would do it is whenever transform() returns, the column list will be updated based on the keys of the returned object.
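For reference, one way to get the new column out under the current API is to name the full column list on the stringifier side; a sketch:

const { parse } = require('csv-parse');
const { transform } = require('stream-transform');
const { stringify } = require('csv-stringify');
const fs = require('fs');

fs.createReadStream('input.csv') // hypothetical path
  .pipe(parse({ columns: true }))
  .pipe(transform((record) => {
    record.column3 = 'the-awesome-column-' + record.column1;
    return record;
  }))
  // naming the columns explicitly lets the stringifier emit column3
  .pipe(stringify({ header: true, columns: ['column1', 'column2', 'column3'] }))
  .pipe(process.stdout);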

Stop parsing within on('data') or transform()

I am attempting to do simple data validation within a .on('data') callback, and when I bump into an error, I would like to stop the parsing. I've attempted to throw an error, but it seems that it bubbles up out of the csv package, i.e. it is not caught and passed on to .on('error').

csv().from('...').on('data', function(data, index) { throw new Error('insufficient data in message'); }).on('error', function(err) { console.log('I never get called :('); });

The stack trace I am getting is:

Error: insufficient data in message
    at [object Object].<anonymous> (/Users/gudmundur/Development/CSharp/Sources/vmkerfi/tools.ssBridge/lib/bridge.coffee:114:35)
    at [object Object].emit (events.js:70:17)
    at write (/Users/gudmundur/Development/CSharp/Sources/vmkerfi/tools.ssBridge/node_modules/csv/lib/csv.js:222:17)
    at flush (/Users/gudmundur/Development/CSharp/Sources/vmkerfi/tools.ssBridge/node_modules/csv/lib/csv.js:409:9)
    at [object Object].end (/Users/gudmundur/Development/CSharp/Sources/vmkerfi/tools.ssBridge/node_modules/csv/lib/csv.js:143:17)
    at Array.0 (/Users/gudmundur/Development/CSharp/Sources/vmkerfi/tools.ssBridge/node_modules/csv/lib/csv.js:83:18)
    at EventEmitter._tickCallback (node.js:192:40)

Am I going about this the wrong way or is this an actual issue?
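For reference, with current Node streams a cleaner route than throwing from the handler is to destroy the stream with the error, which routes it to the 'error' listener; a sketch (the length check is a hypothetical validation rule):

const { parse } = require('csv-parse');
const fs = require('fs');

const parser = parse();
fs.createReadStream('data.csv').pipe(parser); // hypothetical input

parser
  .on('data', (record) => {
    if (record.length < 3) {
      // destroy() stops the stream and emits the error on 'error'
      parser.destroy(new Error('insufficient data in message'));
    }
  })
  .on('error', (err) => console.error(err.message)); // now gets called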

Parsing files containing unicode characters

When I parse files that apparently contain some non-printable Unicode characters, I get the entire parsed file back as garbled characters. I am assuming this has something to do with the default utf8 option. Can you please let me know what other options exist for parsing CSV files containing Western European characters?

Appreciate your help.
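For reference, ISO-8859-1 (Latin-1) text can be decoded with Node's built-in 'latin1' encoding before it reaches the parser; for charsets Node lacks, a decoding stream such as iconv-lite can be piped in front instead. A sketch using the built-in encoding:

const { parse } = require('csv-parse');
const fs = require('fs');

fs.createReadStream('western-european.csv', { encoding: 'latin1' }) // hypothetical file
  .pipe(parse({ delimiter: ';' }))
  .on('data', (record) => console.log(record));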

stdout cannot be closed

Node: v0.6.14

Not sure if I'm doing something wrong here, but I get the following error with this code:

events.js:48
        throw arguments[1]; // Unhandled 'error' event
                       ^
Error: process.stdout cannot be closed.
    at WriteStream.<anonymous> (node.js:284:20)
    at WriteStream.end (net.js:242:10)
    at [object Object].end (/Users/bliss/code/acuhoi-scrape/node_modules/csv/lib/csv.js:163:37)
    at [object Object].<anonymous> (/Users/bliss/code/acuhoi-scrape/node_modules/csv/lib/csv.js:110:18)
    at [object Object].emit (events.js:64:17)
    at afterRead (fs.js:1117:12)
    at Object.wrapper [as oncomplete] (fs.js:254:17)

Maybe documentation should specify 1st parameter as "row" for transform & record event

Being slightly pedantic here, but instead of calling the 1st parameter "data" in the transform & record event, it should be called "row" to make things a little clearer.

Example:

csv = require 'csv'
fs = require 'fs'
csv()
.from.stream(fs.createReadStream('file.csv'))
.to.array (output) ->
    output # process output here
.transform (row) ->
    if row[0] <= 10 then row[1] else null

If you think it's a good idea, I can update the doc & submit a pull request if you like.

TSV parsing example

Hi, I simply want to parse my raw TSV text to array e.g.

str = 
"1    2    3
a    b    c"

to

arr = [[1,2,3], [a,b,c]]

but I couldn't figure out how to properly set the delimiter option and so on from the tutorial and test code.

I'd be glad if you could show this in the tutorial in more detail.
Thanks.
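For reference, a minimal sketch with the current csv-parse API: the delimiter option takes the tab character directly. Note that values come back as strings unless you cast them.

const { parse } = require('csv-parse');

const str = '1\t2\t3\na\tb\tc';
parse(str, { delimiter: '\t' }, (err, arr) => {
  if (err) throw err;
  console.log(arr); // [ ['1', '2', '3'], ['a', 'b', 'c'] ]
});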

How to apply backpressure on async transform

Hi, I'm trying to read URLs from a CSV file, check whether they're available, and write out bad URLs to another file. I get through about 400, and then get "FATAL ERROR: CALL_AND_RETRY_2 Allocation failed - process out of memory", presumably because I'm not applying backpressure correctly to the emitting stream.

It seems like I need to use stream.pipe, but I don't understand how to do it.

var csv = require('csv')
var request = require('request');

csv()
.from.path('./urls.csv', {columns: true})
.to.path('./badurls.csv')
.transform(function(data, index, callback){
  checkImage(null,callback,data['main-image-url'],index)
})

function checkImage(err, callback, url, index) {
  if (url != "") {
    request.head(url, function(err, res) {
      console.log(index,res.statusCode,url);
      if (res.statusCode != 200) {
        callback(null,url+"\n");
      }
    })
  }
}

Thanks in advance for suggestions on how to apply backpressure. You might also want to use this example of an async transform, as it's more realistic than what you have now.
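For reference, a sketch of the same job under the current packages. Backpressure comes from the async transform handler: the pipeline only sends more records as callbacks complete, and the parallel option caps concurrency. The key bug above is that checkImage does not invoke the callback on every path, so pending records pile up; the hypothetical checker below always answers:

const { parse } = require('csv-parse');
const { transform } = require('stream-transform');
const fs = require('fs');
const request = require('request'); // as in the original snippet

fs.createReadStream('./urls.csv')
  .pipe(parse({ columns: true }))
  .pipe(transform((record, callback) => {
    const url = record['main-image-url'];
    if (!url) return callback(); // skip, but always answer
    request.head(url, (err, res) => {
      if (err) return callback(err);
      // emit a line only for bad URLs; a null output skips the record
      callback(null, res.statusCode !== 200 ? url + '\n' : null);
    });
  }, { parallel: 10 })) // cap concurrent HTTP checks
  .pipe(fs.createWriteStream('./badurls.csv'));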

“Unknown encoding”

Did I write the encoding name wrong?

Code:

.fromPath(__dirname+'/kontoutskrift.csv',{
    delimiter: ';',
    columns: ['dato', 'forklaring', 'rentedato', 'ut', 'inn'],
    encoding: 'ISO-8859-1'
})

Error:

buffer.js:430
        throw new Error('Unknown encoding');
        ^

Error: Unknown encoding
    at Buffer.toString (buffer.js:430:13)
    at [object Object].write (string_decoder.js:35:19)
    at [object Object]._emitData (fs.js:1152:32)
    at afterRead (fs.js:1137:10)
    at Object.wrapper [as oncomplete] (fs.js:254:17)

Reading a few lines into an array

I couldn't find documentation on reading CSV lines into an array, so I coded this.

Would this be best practice, or is there a better way of doing it?

array = []
csv()
.from.stream(fs.createReadStream(csvFileName))
.transform (data, index) ->
    array.push data
    if parseInt(data[0]) == 10
        this.end()
        console.log array # or do whatever here
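For reference, an equivalent sketch with the current API (csvFileName assumed defined as in the snippet above); destroy() is the supported way to stop consuming mid-file:

const { parse } = require('csv-parse');
const fs = require('fs');

const records = [];
const parser = parse();
fs.createReadStream(csvFileName).pipe(parser);

parser.on('data', (record) => {
  records.push(record);
  if (parseInt(record[0], 10) === 10) {
    parser.destroy();     // stop reading further input
    console.log(records); // or do whatever here
  }
});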

Appending seems to overwrite file

I'm trying this:

var fields = 'some,data,values';
var file = csv().from(fields).toPath('posts.csv', {flag: 'a'});

However, the posts.csv file gets overwritten and only contains the data in the fields variable, none of its previous contents. I was expecting fields to be appended to the end of the file.
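For reference, under the current API the append behaviour comes from the fs layer rather than a csv option; a sketch:

const { stringify } = require('csv-stringify');
const fs = require('fs');

const stringifier = stringify();
stringifier.pipe(fs.createWriteStream('posts.csv', { flags: 'a' })); // 'a' appends
stringifier.write(['some', 'data', 'values']);
stringifier.end();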

expresso -I test/transform.js fails

Hi,

This test fails:

 expresso -I test/transform.js

Output is:

uncaught: AssertionError: "20322051544,88017226,4.5,279378000000\n28392898392,88392926,8.3,949323600000" == "20322051544,88017226,4.5,279414000000\n28392898392,88392926,8.3,949359600000"
at [object Object].<anonymous> (/home/jseymour/git/personal/node-csv-parser/test/transform.js:94:11)
at [object Object].emit (events.js:42:17)
at [object Object].<anonymous> (/home/jseymour/git/personal/node-csv-parser/lib/csv.js:127:9)
at [object Object].emit (events.js:39:17)
at fs.js:931:12

Failures: 1

My execution environment is 64-bit Ubuntu 10.10, in the Australia/Sydney timezone (the expected and actual timestamps differ by exactly 36,000,000 ms, i.e. 10 hours, which points at a timezone-dependent test).

Performance drop from 0.0 to 0.2

The test used in ticket #63 showed a performance drop from 0.0.19 to 0.2.2 and subsequent releases (up to 0.2.4 tested).

It could be due to the mandatory invocation of a transformer for each line (I don't think that was the case in 0.0.x).
In any case it's something worth inspecting IMHO.

document has no headers if you pass in an empty array

e.g.:

csv().from.array([]).to.path('temp.csv', {header: true, columns: ['these', 'are', 'my', 'headers']}).on('close', function(count) { console.log(count); });

I would expect this to produce a document with a header row and nothing else, but it doesn't even have the header row and instead produces a 0-byte document.

Output stream ending when {end: false} passed in "to"

I'm trying to leave my output stream open so that I can transform and concatenate a set of CSV files. I've tried passing {end: false} in the to call:

csv().from(csv_path).transform(process_record).to(out_stream, {end: false}).......

But the output stream is still being end()ed after each CSV file ends.

I notice in to.js, the "options" argument is not being passed to the pipe command.

to.stream = function(stream, options) {
.
.
.
csv.pipe(stream);
.
.

Is this correct or should that be csv.pipe(stream, options);?

null values

Is there a way for the csv writer to print a blank entry for a null value instead of 'null'?
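For reference, one workaround is to blank the nulls in a transform step before the writer sees them; a sketch with the current packages:

const { transform } = require('stream-transform');
const { stringify } = require('csv-stringify');

// replace nulls with empty strings before stringification
const blankNulls = transform((record) =>
  record.map((value) => (value === null ? '' : value))
);

blankNulls.pipe(stringify()).pipe(process.stdout);
blankNulls.write(['a', null, 'c']);
blankNulls.end(); // prints "a,,c"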

always quote option

In the writeOptions I added an option called alwaysQuote which is set to false by default.
On line 445 I changed it from
if(containsQuote || containsdelimiter || containsLinebreak){
to
if(containsQuote || containsdelimiter || containsLinebreak || csv.writeOptions.alwaysQuote){

that way I can force every column to be wrapped in the quote char
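For reference, current csv-stringify ships this behaviour as the quoted option, so no patch is needed there; a sketch:

const { stringify } = require('csv-stringify');

stringify([['a', 'b', 'c']], { quoted: true }, (err, output) => {
  if (err) throw err;
  console.log(output); // "a","b","c"
});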

string needs newline otherwise incorrectly parsed

csv = require 'csv'
str = '"1","2","3","4","5",""'
csv().from(str,',').on 'data', (data) -> console.log data.length
str = '"1","2","3","4","5",""\n'
csv().from(str,',').on 'data', (data) -> console.log data.length

output:

5
6

bug or a feature?

Empty values are transformed to a quote character

When the parser encounters an empty value, the result is a single quote character. Here is a simple script duplicating the bug. The last two values should probably be null.

var csv = require('csv'),
    str = '"01/20/2010","19:15:00","Liquor Laws","200-298 block of NW 1ST AVE, PORTLAND, OR 97209","CHINA/OLD TOWN","PORTLAND PREC CE","822","",""\n"01/07/2010","16:25:00","Liquor Laws","200-298 block of NW 1ST AVE, PORTLAND, OR 97209","CHINA/OLD TOWN","PORTLAND PREC CE","822","",""'

csv()
  .from(str)
  .on('data', function(d) {
    console.log(d); 
  })

More documentation needed

Especially on the various options (e.g. input stream, encoding).
Of course I could look up the source code to figure out where the options are, but it still takes some effort...

parser chokes on escaped quotes in fields with "\""

for the valid CSV input:

0,"\"qux\""

the parser chokes:

[Error: Invalid closing quote; found "q" instead of delimiter ","]

It seems the parser is not handling the escaped quotes inside the string. I don't have the time to do a pull request now; maybe I can loop back around for it. In the meantime I need to find a usable CSV parser...
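For reference, the parser defaults to RFC 4180 style double-quote escaping (""), but the current csv-parse escape option accepts a backslash for input like this; a sketch:

const { parse } = require('csv-parse');

const input = '0,"\\"qux\\""'; // the literal bytes 0,"\"qux\""
parse(input, { escape: '\\' }, (err, records) => {
  if (err) throw err;
  console.log(records); // [ [ '0', '"qux"' ] ]
});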

double quotes in a tab-delimited file throws error

csv = require 'csv'
str = '1\t2\t3'
csv().from(str, {delimiter:'\t'}).on('data', (data) -> console.log data.length).on('error', (err) -> console.log err)
str = '"1"\t2\t3'
csv().from(str, {delimiter:'\t'}).on('data', (data) -> console.log data.length).on('error', (err) -> console.log err)
str = '"1" asdf\t2\t3'
csv().from(str, {delimiter:'\t'}).on('data', (data) -> console.log data.length).on('error', (err) -> console.log err)

output:

3
3
[Error: Invalid closing quote; found " " instead of delimiter " "]

mocha.opts

Not sure if you knew this, but if you create a file called mocha.opts in your test directory with the contents --compilers coffee:coffee-script, you can run mocha in your project home directory and it'll run the coffeescript tests without you having to specify that the tests are in coffeescript.

multiple columns with the same name

Say I have a bunch of objects structured like: {field1: val1, field2: val2, field3: val3} and I want to write them to a file with the headings in a specific order. Normally I'd use the "columns" option to give it an ordering (e.g. ["field1", "field3", "field2"]).

What would I do if I want all of those fields to have the same name in the outputted csv? e.g. the header row would look like "field,field,field"

As far as I can tell, there's no way to do that currently. I was trying to think of the best way to add that support and nothing really stuck out - I couldn't just modify columns to take in an object, because the key:val pairings there are unordered. It seems the way to do it would be able to optionally allow columns to take in an array of arrays, e.g. [["field1", "field"], ["field3", "field"], ["field2", "field"]].

So I guess what I'm asking: if I submitted a pull request that made that change, would you accept it?
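For reference, current csv-stringify resolves this without an array-of-arrays: the columns option accepts objects pairing an ordered key with a display header, and headers may repeat. A sketch:

const { stringify } = require('csv-stringify');

const records = [{ field1: 'a', field2: 'b', field3: 'c' }];

stringify(records, {
  header: true,
  columns: [
    { key: 'field1', header: 'field' },
    { key: 'field3', header: 'field' },
    { key: 'field2', header: 'field' },
  ],
}, (err, output) => {
  if (err) throw err;
  console.log(output); // field,field,field\na,c,b
});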

transformed objects with boolean field values die

using the transform(transformer) as specified below:

function transformer(incoming) {
  return {
    one: incoming[0],
    two: incoming[1] == "True"
  };
}

throws an error:

TypeError: Object true has no method 'indexOf'
at /opt/foo/node_modules/csv/lib/csv.js:253:55
at Array.forEach (native)
at write (/opt/foo/node_modules/csv/lib/csv.js:242:22)
at flush (/opt/foo/node_modules/csv/lib/csv.js:407:9)
at parse (/opt/foo//node_modules/csv/lib/csv.js:366:21)
at [object Object].<anonymous> (/opt/foo/node_modules/csv/lib/csv.js:92:17)
at [object Object].emit (events.js:67:17)
at [object Object]._emitData (fs.js:1117:29)
at afterRead (fs.js:1101:10)
at Object.wrapper [as oncomplete]
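A workaround under that legacy writer is to hand it strings only; a minimal sketch of the same transformer with the boolean cast made explicit:

function transformer(incoming) {
  return {
    one: incoming[0],
    two: String(incoming[1] == "True") // "true"/"false" instead of a bare boolean
  };
}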

process.stdout cannot be closed

When I use

.toStream(process.stdout)

node.js throws the error:

Error: process.stdout cannot be closed 

Is there an option to not close the stream?

Full example:

var csv = require('csv');

var arr = [
  [1,2,3,4,5],
  [2,4,6,8,10]
];

csv()
  .from(arr)
  .toStream(process.stdout);  // throws on csv.js line 150
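For reference, Node's own pipe API can leave the destination open, which sidesteps the close; a sketch with the current stringifier:

const { stringify } = require('csv-stringify');

const arr = [
  [1, 2, 3, 4, 5],
  [2, 4, 6, 8, 10]
];

const stringifier = stringify();
stringifier.pipe(process.stdout, { end: false }); // leave stdout open
arr.forEach((row) => stringifier.write(row));
stringifier.end();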

BOM?

I think the BOM is preventing reading of my first column header. Is there any option that I can use to have csv-parser ignore it?
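For reference, current csv-parse has a bom option for exactly this; a sketch:

const { parse } = require('csv-parse');
const fs = require('fs');

fs.createReadStream('with-bom.csv') // hypothetical file
  .pipe(parse({ bom: true, columns: true })) // strip a leading UTF-8 BOM
  .on('data', (record) => console.log(record));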
