csv-validate: Validating CSV Parser

This JS CSV validating parser is based on csv-parse and csv-sniffer. It features data streaming, error tracing, and automatic identification of the CSV delimiters and quotes.

csv-validate provides a NodeJS module (NPM), an optional console UI (CLI), and a browser Web UI for local validation of CSV files.

\author Artem Lutov <[email protected]>
\affiliation Zazuko GmbH, Lutov Analytics
\license Apache 2

Web UI

The Web UI bundle is located in dist/ and consists of 3 files (plus an optional test.csv example for testing the web app):

dist/
  index.html
  styles.css
  main.js

To build the Web UI, run npm run build from the package directory; this produces the main.js bundle in dist/. Afterward, open index.html in your browser to use the web app.

VuePress Component

The VuePress component is located in dist/VuePress/ and relies on dist/main.js. To install it into your VuePress site, build the Web UI bundle and then copy the files:

$ cp dist/main.js ${VUEPRESS_ROOT}/
$ cp dist/VuePress/CsvValidate.vue ${VUEPRESS_ROOT}/components/
$ cp dist/VuePress/csv-validate.md ${VUEPRESS_ROOT}/..

where ${VUEPRESS_ROOT} is typically <WEBSITE_APP>/.vuepress.

CLI

The console UI provides the following interface:

$ npx csv-validate -h
Usage: csv-validate [options] <filename>

Options:
  -r, --relax-column-count   relax column count instead of emitting an error
  -s, --skip-error-lines     skip lines with errors instead of emitting an
                             error
  -d, --delimiter <symbol>   enforce specified CSV delimiter instead of
                             inferring it
  -q, --quotes <l-r-quotes>  left [and right if distinct] quote symbols instead
                             of inferring them (default: "")
  -n, --new-line <string>    enforce specified CSV new line (e.g., \r\n)
                             instead of inferring it
  -e, --encoding <string>    file encoding (default: "utf8")
  -h, --help                 display help for command

To install it, execute:

$ npm install [-g] csv-validate

Alternatively, the executable can be called directly from the package sources:

$ bin/csv-validate.js -h

Example:

$ bin/csv-validate.js test/csv/HDB*.csv

Processing: test/csv/HDB_GLOSSAR.csv
Processing progress: 12 %
inpPartProc()> Inferred delimiter: ,
inpPartProc()> Inferred quoteChar: "
inpPartProc()> Inferred newLine (codepoints):  0xD 0xA
ERROR in test/csv/HDB_GLOSSAR.csv: Error: Invalid Record Length: columns length is 10, got 3 on line 42

Processing: test/csv/HDB_KENZAHLEN.csv
Processing progress: 14 %
inpPartProc()> Inferred delimiter: ,
inpPartProc()> Inferred quoteChar: "
inpPartProc()> Inferred newLine (codepoints):  0xA
Processing progress: 25 %
ERROR in test/csv/HDB_KENZAHLEN.csv: Error: Invalid Record Length: columns length is 14, got 11 on line 167

Processing: test/csv/HDB_QUELLE.csv
Processing progress: 27 %
inpPartProc()> Inferred delimiter: ,
inpPartProc()> Inferred quoteChar: "
inpPartProc()> Inferred newLine (codepoints):  0xD 0xA
Processing progress: 37 %
Completed test/csv/HDB_QUELLE.csv

Processing: test/csv/HDB_RAUM_fail.csv
Processing progress: 41 %
inpPartProc()> Inferred delimiter: ,
inpPartProc()> Inferred quoteChar: "
inpPartProc()> Inferred newLine (codepoints):  0xD 0xA
Processing progress: 55 %
Processing progress: 68 %
Processing progress: 82 %
Processing progress: 95 %
Processing progress: 100 %
ERROR in test/csv/HDB_RAUM_fail.csv: Error: Invalid Record Length: columns length is 24, got 22 on line 3810

CSV Validation Summary:
test/csv/HDB_GLOSSAR.csv    FAIL
    Error: Invalid Record Length: columns length is 10, got 3 on line 42
test/csv/HDB_KENZAHLEN.csv    FAIL
    Error: Invalid Record Length: columns length is 14, got 11 on line 167
test/csv/HDB_QUELLE.csv    OK
test/csv/HDB_RAUM_fail.csv    FAIL
    Error: Invalid Record Length: columns length is 24, got 22 on line 3810
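The "Invalid Record Length" errors in the summary above are reported when a record has a different number of fields than expected. As a rough illustration only, the check can be sketched as follows. The function name checkRecordLengths is hypothetical, the message format is copied from the sample output, and the sketch splits naively on the delimiter without handling quotes, so it is not how csv-validate or csv-parse implement it.

```javascript
// Naive sketch: compares each line's field count with the first line's.
// Assumption: no quoted fields containing delimiters or newlines.
function checkRecordLengths (text, delimiter = ',', newLine = '\n') {
  const lines = text.split(newLine)
  // Ignore the empty element left by a trailing newline.
  if (lines[lines.length - 1] === '') lines.pop()
  if (lines.length === 0) return []

  const expected = lines[0].split(delimiter).length
  const errors = []
  lines.forEach((line, index) => {
    const got = line.split(delimiter).length
    if (got !== expected) {
      errors.push(
        `Invalid Record Length: columns length is ${expected}, got ${got} on line ${index + 1}`
      )
    }
  })
  return errors
}

// The third line has 2 fields instead of 3.
console.log(checkRecordLengths('a,b,c\n1,2,3\n4,5\n'))
```

With --relax-column-count such mismatches are tolerated, and with --skip-error-lines the offending lines are skipped instead of failing the file.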

NodeJS Module

The module interface abstracts csv-parse and csv-sniffer.
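To give an intuition for what delimiter inference involves, here is a deliberately simple heuristic. It is an assumption-laden sketch and not the algorithm used by csv-sniffer: the function name guessDelimiter and the candidate list are hypothetical, and quoted fields are ignored.

```javascript
// Naive heuristic (NOT csv-sniffer's algorithm): choose the candidate that
// occurs the same non-zero number of times on every sampled line,
// preferring the candidate with the highest per-line count.
function guessDelimiter (sample, candidates = [',', ';', '\t', '|']) {
  const lines = sample.split(/\r?\n/).filter(line => line.length > 0)
  let best = null
  let bestCount = 0
  for (const candidate of candidates) {
    const counts = lines.map(line => line.split(candidate).length - 1)
    const consistent = counts.length > 0 && counts.every(count => count === counts[0])
    if (consistent && counts[0] > bestCount) {
      best = candidate
      bestCount = counts[0]
    }
  }
  return best // null when no candidate fits
}

console.log(JSON.stringify(guessDelimiter('key1;key2\nvalue1_1;value2_1\n')))
```

When such inference would be unreliable for a given file, the delimiter option described below can be used to enforce a value instead.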

Interface

The main interface is represented by a single function (see index.js: Parser):

static import (input: readable-stream.Readable, options: Object)

where the possible options are (see index.js: Parser constructor):

relaxColumnCount: boolean,  // Default: undefined (false)
skipLinesWithError: boolean,  // Default: undefined (false)
delimiter: string,  // Item delimiter, e.g. ',' or ' '. Default: automatically inferred
quotes: string,  // Item quotation symbol, e.g. '\'' or '"'. Default: automatically inferred
newLine: string  // Record separator, i.e., a newline character sequence, e.g., '\n' or '\r\n'. Default: automatically inferred
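As an illustration, the options listed above might be combined as shown below. The option names come from the list; the concrete values and the pruneOptions helper are hypothetical and only sketch one way to keep unset values out of the object so that those settings remain inferred.

```javascript
// Enforce the CSV dialect explicitly instead of relying on inference.
const strictOptions = { delimiter: ';', quotes: '"', newLine: '\r\n' }

// Tolerate malformed records instead of failing on the first one.
const lenientOptions = { relaxColumnCount: true, skipLinesWithError: true }

// Hypothetical helper: drop undefined or empty values so that the
// corresponding settings are simply absent from the options object.
function pruneOptions (options) {
  return Object.fromEntries(
    Object.entries(options).filter(([, value]) => value !== undefined && value !== '')
  )
}

console.log(pruneOptions({ delimiter: ',', quotes: '', newLine: undefined }))
```

Whether the parser treats an explicitly passed undefined the same as a missing key is not stated here, which is why the sketch removes such keys up front.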

Usage

The interface can be used as follows:

const CsvValidatingParser = require('csv-validate')
const { PassThrough } = require('readable-stream')

// Build an in-memory readable stream holding a small CSV
const input = new PassThrough()
input.write('key1,key2\n')
input.write('value1_1,value2_1\n')
input.write('value1_2,value2_2\n')
input.end()

// Validate the stream, enforcing '\n' as the record separator instead of inferring it
CsvValidatingParser.import(input, { newLine: '\n' })

See test/interface.test.js for more examples.
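For files on disk, a thin wrapper along these lines may be convenient. It is a sketch under assumptions: validateFile is a hypothetical name, the parser is passed in as an argument, and a stream from Node's fs.createReadStream is assumed to be acceptable where the interface asks for a readable-stream Readable. The return value of import is passed through unchanged, since it is not described in this section.

```javascript
const fs = require('fs')

// `parser` is expected to expose import(input, options) as described above,
// e.g. the object returned by require('csv-validate').
function validateFile (parser, filename, options = {}) {
  // Stream the file rather than loading it into memory at once.
  const input = fs.createReadStream(filename)
  return parser.import(input, options)
}
```

A call could then look like validateFile(require('csv-validate'), 'test/csv/HDB_QUELLE.csv', { newLine: '\r\n' }).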
