
csv-parser-node's Introduction

Instructions

The aim is to write a package that converts CSV text input into arrays or objects. You must use streams to accomplish the task.

Input

  • Your input can be a file path or a string containing CSV text.

Output

  • Your output should be a stream, so the pipe chain can continue, unless overridden by the user. See below.

Mandatory Features

1. Follows the Node.js streaming API best practices.

  • Use async iterators and generators wherever you can.
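
One way to combine async generators with streams is sketched below. It is only an illustration under stated assumptions: `splitLines`, `toRows`, and `parseStream` are hypothetical names, and the field split is a naive comma split with no quote handling.

```javascript
const { Readable } = require("node:stream");

// Hypothetical helper: re-chunks arbitrary text chunks into complete lines.
async function* splitLines(source) {
  let buffered = "";
  for await (const chunk of source) {
    buffered += chunk;
    const lines = buffered.split(/\r?\n/);
    buffered = lines.pop(); // the last piece may be an incomplete line
    yield* lines;
  }
  if (buffered !== "") yield buffered; // final record without a line break
}

// Hypothetical helper: naive comma split, quotes are NOT handled here.
async function* toRows(lines) {
  for await (const line of lines) yield line.split(",");
}

// Readable.from wraps the generator chain in an object-mode stream,
// so the caller can keep piping or iterate with for await.
function parseStream(source) {
  return Readable.from(toRows(splitLines(source)));
}
```

Because each stage is an async generator, rows are produced lazily as the consumer reads them; the input never has to be held in memory as a whole.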

2. Parse CSV files

YOUR_API("path to localFile/remote files (if you can do)", { ...YOUR OPTIONS });

3. Can convert JSON to CSV and vice-versa

YOUR_API('JSON STRING');
//=> outputs CSV (via stream? or string?)

YOUR_API('CSV STRING');
//=> outputs JSON  (via stream? or string?)

Note: Not every JSON document maps cleanly to CSV. Ask the developer how to fill the gaps.
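
As a rough sketch of the JSON to CSV direction, the function below assumes the JSON string holds an array of flat objects and lets the developer choose a `fill` value for missing keys. The name `jsonToCsv` and its options are hypothetical, and it returns a string rather than a stream for brevity.

```javascript
// Hypothetical helper: converts a JSON array of flat objects to CSV text.
function jsonToCsv(json, { delimiter = ",", fill = "" } = {}) {
  const records = JSON.parse(json);
  // The union of all keys, in first-seen order, becomes the header row.
  const columns = [...new Set(records.flatMap((record) => Object.keys(record)))];
  // Quote every field and double any embedded quote character.
  const quote = (value) => `"${String(value).replace(/"/g, '""')}"`;
  const rows = records.map((record) =>
    columns
      .map((column) => quote(column in record ? record[column] : fill))
      .join(delimiter)
  );
  return [columns.map(quote).join(delimiter), ...rows].join("\n");
}
```

Nested objects and arrays are outside the scope of this sketch; a real implementation would need a rule for them too.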

4. Support delimiters, quotes and escape characters

  • Ask yourself one question: Does the separator have to be only a comma? Can it be other characters?
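
A minimal sketch of a configurable record splitter follows. It assumes single-character `delimiter`, `quote`, and `escape` options and a record that is already isolated on one line; `splitRecord` is a hypothetical name, not a required API.

```javascript
// Hypothetical helper: splits one record into fields.
function splitRecord(record, { delimiter = ",", quote = '"', escape = '"' } = {}) {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < record.length; i++) {
    const char = record[i];
    if (inQuotes) {
      if (char === escape && record[i + 1] === quote) {
        field += quote; // escaped quote inside a quoted field
        i++;
      } else if (char === quote) {
        inQuotes = false; // closing quote
      } else {
        field += char; // delimiters inside quotes are plain text
      }
    } else if (char === quote) {
      inQuotes = true; // opening quote
    } else if (char === delimiter) {
      fields.push(field);
      field = "";
    } else {
      field += char;
    }
  }
  fields.push(field);
  return fields;
}
```

For example, `splitRecord('a;"b;c"', { delimiter: ";" })` keeps the semicolon inside the quoted field instead of splitting on it.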

5. Header row support

  • There may be an OPTIONAL header line appearing as the first line of the file with the same format as normal records. This header will contain names corresponding to the fields in the file, and MUST contain the same number of fields as the records in the rest of the file.
# INPUT
field_1,field_2,field_3¬
aaa,bbb,ccc¬
xxx,yyy,zzz¬

# OUTPUT (ignoring headers)
[ ["field_1", "field_2", "field_3"],
  ["aaa", "bbb", "ccc"],
  ["xxx", "yyy", "zzz"] ]

# OUTPUT (using headers)
[ {"field_1": "aaa", "field_2": "bbb", "field_3": "ccc"},
  {"field_1": "xxx", "field_2": "yyy", "field_3": "zzz"} ]
  • Let the user transform the headers if the existing ones are not suitable.
YOUR_API(
  `
"key_1","key_2"
"value 1","value 2"
`,
  {
    headers: (header) => header.map((column) => column.toUpperCase()),
  }
);
//=>
// [{
//	KEY_1: 'value 1',
//	KEY_2: 'value 2'
// }]
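
The header handling could be sketched as below, assuming the rows have already been parsed into arrays of strings. `applyHeaders` is a hypothetical helper; its `headers` argument mirrors the option shown above and may be `false`, `true`, or a transform function.

```javascript
// Hypothetical helper: rows is an array of string arrays.
function applyHeaders(rows, headers = false) {
  if (!headers) return rows; // no header handling: return arrays unchanged
  const [first, ...records] = rows;
  // A function transforms the header row; any other truthy value uses it as-is.
  const names = typeof headers === "function" ? headers(first) : first;
  return records.map((record, index) => {
    if (record.length !== names.length) {
      throw new RangeError(
        `record ${index + 1} has ${record.length} fields, expected ${names.length}`
      );
    }
    return Object.fromEntries(names.map((name, i) => [name, record[i]]));
  });
}
```

Throwing on a field-count mismatch is just one choice here; it follows the rule that the header must contain the same number of fields as the records.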

6. Spaces are considered part of a field and MUST NOT be ignored

7. You should give the ability to skip the default stream return value.

  • Perhaps by returning a promise that resolves with the entire output?
  • Or by taking a callback which is invoked when the task is complete?
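
Both options can share one code path, as in this sketch. It assumes the parser produces an async-iterable stream of rows; `collect` is a hypothetical name.

```javascript
// Hypothetical helper: drains a row stream into an array.
// Returns a promise when no callback is given, otherwise calls
// callback(error, rows) in the usual Node.js style.
function collect(stream, callback) {
  const promise = (async () => {
    const rows = [];
    for await (const row of stream) rows.push(row);
    return rows;
  })();
  if (typeof callback !== "function") return promise;
  promise.then(
    (rows) => callback(null, rows),
    (error) => callback(error)
  );
}
```

Calling `collect(stream)` yields a promise for the full output, while `collect(stream, (error, rows) => { ... })` uses the callback form instead.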

8. Should handle errors, continue parsing even for ambiguous CSVs, and return the errors for each row

  • The user should be able to choose whether to stop on the first error, skip the ill-formed line, get a collection of all ill-formed lines at the end, or use another strategy you come up with.
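
The three strategies might look like the sketch below. It assumes that "ill-formed" simply means a wrong field count; `validateRows`, `expectedFields`, and the `onError` values `"throw"`, `"skip"`, and `"collect"` are all hypothetical names.

```javascript
// Hypothetical helper: rows is an array of string arrays.
function validateRows(rows, { expectedFields, onError = "collect" } = {}) {
  const valid = [];
  const errors = [];
  rows.forEach((row, index) => {
    if (row.length === expectedFields) {
      valid.push(row);
      return;
    }
    const message = `line ${index + 1}: expected ${expectedFields} fields, got ${row.length}`;
    if (onError === "throw") throw new Error(message); // stop on the first error
    if (onError === "collect") errors.push({ line: index + 1, message, row });
    // onError === "skip": drop the ill-formed row and keep going
  });
  return { rows: valid, errors };
}
```

In a streaming parser the same decision would be made per row inside the transform, with the collected errors delivered when the stream ends.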

9. Skip commented line

const data = `
# At the beginning of a record
"hello"
"world"# At the end of a record
`.trim();

YOUR_API(data);
// output
// [
//  [ "hello" ],
//  [ "world" ]
// ]
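
Comment stripping has to ignore comment characters that sit inside quoted fields. The sketch below assumes single-character `comment` and `quote` options and records that do not span multiple lines; `stripComment` and `skipComments` are hypothetical names.

```javascript
// Hypothetical helper: cuts a line at the first comment character outside quotes.
function stripComment(line, { comment = "#", quote = '"' } = {}) {
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    if (line[i] === quote) inQuotes = !inQuotes;
    else if (line[i] === comment && !inQuotes) return line.slice(0, i);
  }
  return line;
}

// Hypothetical helper: returns the lines that still hold data.
// Note that this also drops lines that were empty to begin with.
function skipComments(text, options) {
  return text
    .split(/\r?\n/)
    .map((line) => stripComment(line, options))
    .filter((line) => line !== "");
}
```

Applied to the `data` string above, this leaves only the `"hello"` and `"world"` lines for the field parser.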

10. Test coverage

  • Your API should be extensively tested using Jest.

Note: YOUR_API can be a single function like parse. It can also be replaced with something like CSV.parse. Feel free to invent your own API.

Extra Features (not mandatory)

  • Make your library Universal JavaScript (aka isomorphic JavaScript).
  • Support worker threads through a config option. That is, the computation will not happen on the main thread but on worker threads.
  • Auto-detect the delimiter
YOUR_API.detect(CSV_STRING);
//=> "\t"
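
A simple heuristic for detection is to pick the candidate that appears the same number of times on every sampled line. The sketch below assumes a fixed candidate list and ignores quoting, so it is a starting point rather than a robust detector; `detect` here is a standalone hypothetical function.

```javascript
// Hypothetical helper: guesses the delimiter from the first few lines.
function detect(csv, candidates = [",", ";", "\t", "|"]) {
  const lines = csv
    .split(/\r?\n/)
    .filter((line) => line !== "")
    .slice(0, 10); // a small sample is enough for the heuristic
  let best = candidates[0]; // fall back to the first candidate
  let bestCount = 0;
  for (const candidate of candidates) {
    const counts = lines.map((line) => line.split(candidate).length - 1);
    const consistent = counts.every((count) => count === counts[0]);
    if (consistent && counts[0] > bestCount) {
      best = candidate;
      bestCount = counts[0];
    }
  }
  return best;
}
```

A delimiter that occurs inside quoted fields can skew the counts, so a fuller version would run the quote-aware splitter for each candidate instead of a plain `split`.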

Restrictions

  • You should not use any extra libraries.
    • The exceptions are eslint, prettier, babel, and other helper utilities.
  • If your library is not using streams, it won't be evaluated.
  • If your library does not have tests, it won't be evaluated.

CSV Rules

  • Take into account that the comma (,) character is not the only character used as a field delimiter. Semi-colons (;), tabs (\t), and more are also popular field delimiter characters.
  • Each record starts at the beginning of its own line, and ends with a line break (shown as ¬)
# INPUT
aaa,bbb,ccc¬
xxx,yyy,zzz¬

# OUTPUT
[ ["aaa", "bbb", "ccc"],
  ["xxx", "yyy", "zzz"] ]
  • The last record in a file is not required to have an ending line break
  • Spaces are considered part of a field and MUST NOT be ignored
# INPUT
aaa ,  bbb , ccc¬
 xxx, yyy  ,zzz ¬

# OUTPUT
[ ["aaa ", "  bbb ", " ccc"],
  [" xxx", " yyy  ", "zzz "] ]
// Two-line, comma-delimited file
var csv = myCSV.unparse([
  ['1-1', '1-2', '1-3'],
  ['2-1', '2-2', '2-3'],
]);

Reference

https://csv-spec.org/
