json2csv's Introduction

json2csv

Converts a stream of newline-separated JSON data to CSV format.

Installation

Pre-built binaries are available under releases.

If you have a working Go installation, you can use go install:

go install github.com/jehiah/json2csv@latest

Usage

usage: json2csv
    -k fields,and,nested.fields,to,output
    -i /path/to/input.json (optional; default is stdin)
    -o /path/to/output.csv (optional; default is stdout)
    --version
    -p print csv header row
    -h This help

To convert:

{"user": {"name":"jehiah", "password": "root"}, "remote_ip": "127.0.0.1", "dt" : "[20/Aug/2010:01:12:44 -0400]"}
{"user": {"name":"jeroenjanssens", "password": "123"}, "remote_ip": "192.168.0.1", "dt" : "[20/Aug/2010:01:12:44 -0400]"}
{"user": {"name":"unknown", "password": ""}, "remote_ip": "76.216.210.0", "dt" : "[20/Aug/2010:01:12:45 -0400]"}

to:

"jehiah","127.0.0.1"
"jeroenjanssens","192.168.0.1"
"unknown","76.216.210.0"

you would run either

json2csv -k user.name,remote_ip -i input.json -o output.csv

or

cat input.json | json2csv -k user.name,remote_ip > output.csv

json2csv's People

Contributors

bryant1410, daniel-levin, jehiah, pevans96, seenickcode, simonschmidt

json2csv's Issues

Large numbers get parsed as smaller numbers

This may be a bug/feature of the JSON library this tool uses, but my input had a large number in one of the fields, and it was evidently parsed as a different (and negative) number. I'm guessing my number was larger than whatever type json2csv's JSON parser was expecting:

$ echo '{"x": 110750513297351875876238378613844499030}' |  /opt/gocode/bin/json2csv -k x
-9223372036854775808
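
One possible explanation, offered as a guess rather than a reading of json2csv's source: Go's encoding/json decodes a JSON number into float64 when the target is an interface{}, and converting a float64 that is out of range to int64 has an implementation-dependent result. The sketch below shows that json.Decoder.UseNumber keeps the original digits instead. The helper name decodeNumber is hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// decodeNumber (hypothetical helper) decodes one JSON object and returns the
// value stored under key as text. With useNumber set, numeric literals are
// kept as json.Number (their original digits) instead of becoming float64.
func decodeNumber(line, key string, useNumber bool) (string, error) {
	dec := json.NewDecoder(strings.NewReader(line))
	if useNumber {
		dec.UseNumber()
	}
	var row map[string]interface{}
	if err := dec.Decode(&row); err != nil {
		return "", err
	}
	return fmt.Sprint(row[key]), nil
}

func main() {
	line := `{"x": 110750513297351875876238378613844499030}`
	asFloat, _ := decodeNumber(line, "x", false)
	asNumber, _ := decodeNumber(line, "x", true)
	fmt.Println(asFloat)  // a float64 approximation of the value
	fmt.Println(asNumber) // the digits exactly as written in the input
}
```

Since CSV cells are text anyway, carrying the digits through as a string avoids any numeric range limit.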

Installation without go

Hi,

I don't have go, and have no plans of installing/using go in my work place.

Can you please add a pre-compiled json2csv version without go, and/or any run time dependency. I'd like to use this tool.

Thank you.

Use all available cores

json2csv uses a lot of CPU, but only one core. json.Unmarshal seems to take up most of the time, so I made a forked version that runs many of them in parallel.

A 207 MB JSON file with 241,804 lines takes 15 s to convert to CSV with the single-threaded version; using all cores (2 real + 2 hyperthreaded), it takes 7 s. However, running the multi-threaded version on just one core takes 17 s, so it might be worth keeping the single-threaded code path and using it when running on one core.

parallel version
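
The idea can be sketched as a small worker pool. This is not the code from the fork, only an illustration under the assumption that lines can be parsed independently and that output order must match input order; parseLines is a hypothetical name.

```go
package main

import (
	"encoding/json"
	"fmt"
	"runtime"
	"sync"
)

// parseLines (hypothetical helper) unmarshals each line using the given
// number of worker goroutines and returns the results in input order.
// A line that fails to parse leaves a nil entry at its index.
func parseLines(lines []string, workers int) []map[string]interface{} {
	if workers < 1 {
		workers = 1
	}
	out := make([]map[string]interface{}, len(lines))
	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				var row map[string]interface{}
				if err := json.Unmarshal([]byte(lines[i]), &row); err == nil {
					out[i] = row // each index is written by exactly one worker
				}
			}
		}()
	}
	for i := range lines {
		jobs <- i
	}
	close(jobs)
	wg.Wait()
	return out
}

func main() {
	lines := []string{`{"n": 1}`, `{"n": 2}`, `not json`, `{"n": 4}`}
	for i, row := range parseLines(lines, runtime.NumCPU()) {
		fmt.Println(i, row)
	}
}
```

Passing workers as a parameter makes it easy to fall back to a single worker on a one-core machine, which is the case where the issue reports the parallel version being slower.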

JSON file

I have a JSON file that looks like the one below. json2csv works if I put each entry on a single line, but in this format it fails. Is there a way for json2csv to handle a file formatted this way?
{
"count":10,
"entries":
[
{
"id":20232913,
"application":"AuditExampleLogin1",
"user":"TestUser",
"time":"2018-08-15T23:59:40.186-07:00",
"values":
null
},
{
"id":20232914,
"application":"AuditExampleLogin1",
"user":"AUser",
"time":"2018-08-15T23:59:55.186-07:00",
"values":
null
},
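
Since the README describes the input as newline-separated JSON, one workaround is to preprocess such a file into one object per line. The sketch below assumes the document is valid JSON with an "entries" array, as in the excerpt above; entriesToLines is a hypothetical helper, not part of json2csv.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
)

// entriesToLines (hypothetical helper) decodes a document shaped like
// {"count": N, "entries": [...]} and returns each element of "entries"
// as one compact, single-line JSON value.
func entriesToLines(doc string) ([]string, error) {
	var wrapper struct {
		Entries []json.RawMessage `json:"entries"`
	}
	if err := json.Unmarshal([]byte(doc), &wrapper); err != nil {
		return nil, err
	}
	lines := make([]string, 0, len(wrapper.Entries))
	for _, raw := range wrapper.Entries {
		var buf bytes.Buffer
		// Compact strips the insignificant whitespace, including newlines.
		if err := json.Compact(&buf, raw); err != nil {
			return nil, err
		}
		lines = append(lines, buf.String())
	}
	return lines, nil
}

func main() {
	doc := `{
"count": 1,
"entries":
[
{
"id": 20232913,
"user": "TestUser",
"values":
null
}
]
}`
	lines, err := entriesToLines(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```

The printed lines can then be piped into json2csv with the usual -k option.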

-k option not recognized

Attempting to convert a JSON file using the instructions in the README, I get the error: unknown option `-k'

This occurs even when using the exact data and command written on the readme, i.e., input.json:

[
	{"user": {"name":"jehiah", "password": "root"}, "remote_ip": "127.0.0.1", "dt" : "[20/Aug/2010:01:12:44 -0400]"},
	{"user": {"name":"jeroenjanssens", "password": "123"}, "remote_ip": "192.168.0.1", "dt" : "[20/Aug/2010:01:12:44 -0400]"},
	{"user": {"name":"unknown", "password": ""}, "remote_ip": "76.216.210.0", "dt" : "[20/Aug/2010:01:12:45 -0400]"}
]

json2csv -k user.name,remote_ip -i input.json -o output.csv

error: unknown option `-k'

Error decoding JSON at Line 1

I'm trying to test this and am unable to get any output in my CSV file through PowerShell.

I created a file with the JSON data from the README just copy/pasted (test2.json) and when running

json2csv -k user.name,remote_ip -i test2.json -o output.csv

it hangs and then gives the error "ERROR Decoding JSON at line 1: unexpected end of JSON input." I've attached the data as a text file for reference.

test2.txt

output is putting all results into a single row

I'm trying to parse through Instagram follower data, structured as such:

{
    close_friends: { 
        username: timestamp
    }
    followers: {
        follower1: timestamp1, 
        followerN: timestampN
    }
}

Using the command ./json2csv -i input.json -k followers -o output.csv outputs all followers+timestamp in a single row/cell. Is there a way to output 1 follower per row?

Thank you.
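
A minimal sketch of one way to get one follower per row, done as a separate step rather than with json2csv itself. It assumes the real export is valid JSON (the outline above omits quotes) and that each follower maps to a single timestamp; mapToRows is a hypothetical helper.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// mapToRows (hypothetical helper) decodes doc, takes the object stored under
// key, and renders it as CSV with one "name,value" row per entry. Rows are
// sorted by name so the output is stable.
func mapToRows(doc, key string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber() // keep timestamps as written instead of as float64
	var top map[string]map[string]interface{}
	if err := dec.Decode(&top); err != nil {
		return "", err
	}
	names := make([]string, 0, len(top[key]))
	for name := range top[key] {
		names = append(names, name)
	}
	sort.Strings(names)

	var sb strings.Builder
	w := csv.NewWriter(&sb)
	for _, name := range names {
		if err := w.Write([]string{name, fmt.Sprint(top[key][name])}); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	doc := `{"close_friends": {"alice": 1600000000},
	         "followers": {"carol": 1600000002, "bob": 1600000001}}`
	out, err := mapToRows(doc, "followers")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```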

Make error

When running make:

Error received:

Updating goal targets....
File `all' does not exist.
File `json2csv' does not exist.
Must remake target `json2csv'.
cc -I. -I/usr/local/include -O2 -g -o json2csv json2csv.c -L. -L/usr/local/lib -ljson
json2csv.c:156:15: error: non-void function 'parse_fields' should return a value [-Wreturn-type]
if (!str) return;
^
1 error generated.
make: *** [json2csv] Error 1

I have done the json-c pre-requisite changes.

please email me at [email protected] for more info

Thanks!

Comma-delimited data within json element

Thanks for your fantastically useful json2csv script - I've been using it to parse data from OpenLibrary dumps. It's working very well, even though the OL data is very inconsistently structured. One question, though, if I may...

In a case where there are commas within an item, eg

{"subjects": ["Books and reading -- Fiction.", "Storytelling -- Fiction.", "Death -- Fiction.", "Jews -- Germany -- History -- 1933-1945 -- Fiction."]}

json2csv appears to strip out the commas within the value, so the four different subjects all get merged into one. It comes out like this for -k subjects:

[Books and reading -- Fiction. Storytelling -- Fiction. Death -- Fiction. Jews -- Germany -- History -- 1933-1945 -- Fiction.]

Is there a straightforward way to get it to preserve those multiple items within a value? (I don't need them as separate fields in the CSV, but would like to preserve the distinction within the 'subjects' field, if you see what I mean - so they could be delimited by something other than a comma.)

(I tried using the -d flag to set a different field delimiter, e.g. semicolon, but it still stripped out the commas as above.)

Edit: another example...
"subject_places": ["United States", "China"]
comes out as
[United States China]
so it's not really practical to find some automated way of parsing that alas.
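
The bracketed, space-separated output matches what Go's default formatting produces for a decoded JSON array, so the commas are probably never seen as part of the value. A sketch of joining array items with an explicit separator instead; joinArray is a hypothetical helper and the "|" separator is only an example.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// joinArray (hypothetical helper) renders a decoded JSON value for a CSV
// cell. Arrays are joined with sep; any other value uses default formatting.
func joinArray(v interface{}, sep string) string {
	items, ok := v.([]interface{})
	if !ok {
		return fmt.Sprint(v)
	}
	parts := make([]string, len(items))
	for i, item := range items {
		parts[i] = fmt.Sprint(item)
	}
	return strings.Join(parts, sep)
}

func main() {
	var row map[string]interface{}
	line := `{"subject_places": ["United States", "China"]}`
	if err := json.Unmarshal([]byte(line), &row); err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(fmt.Sprint(row["subject_places"]))     // default formatting
	fmt.Println(joinArray(row["subject_places"], "|")) // explicit separator
}
```

The separator should be a character that does not occur in the data, otherwise the items cannot be split apart again.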

Strings are not escaped

If I have the source JSON

{"message": "this is a \"test quote\""}

then when I run json2csv -k message <JSON, the output looks like

"this is a "test quote""

which breaks any CSV reader. The escaping used in the JSON should be carried over to the CSV.
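
For reference, CSV normally escapes a double quote inside a quoted field by doubling it rather than by using a backslash. Go's standard encoding/csv writer does this automatically, as the sketch below shows; csvLine is a hypothetical helper, and whether json2csv should adopt this writer is for the maintainer to decide.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
)

// csvLine (hypothetical helper) writes one record with encoding/csv and
// returns the resulting text, including the trailing newline.
func csvLine(fields ...string) (string, error) {
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	if err := w.Write(fields); err != nil {
		return "", err
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	var row map[string]string
	src := `{"message": "this is a \"test quote\""}`
	if err := json.Unmarshal([]byte(src), &row); err != nil {
		fmt.Println("error:", err)
		return
	}
	line, err := csvLine(row["message"])
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(line) // prints "this is a ""test quote"""
}
```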

running problems

Hi!!

I've installed json2csv by

$go get github.com/jehiah/json2csv

but when I try to do

$cat someFile.json | json2csv -k time,tx_index > output.csv

then I get the following:

2015/04/28 17:18:20 ERROR Decoding JSON at line 1: unexpected end of JSON input
{
2015/04/28 17:18:20 ERROR Decoding JSON at line 2: invalid character ':' after top-level value
"time": 1348310820,
2015/04/28 17:18:20 ERROR Decoding JSON at line 3: invalid character ':' after top-level value
"tx_index": 9741630
2015/04/28 17:18:20 ERROR Decoding JSON at line 4: invalid character '}' looking for beginning of value
}

and so on... What am I missing?

Thank you for your time.
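
The errors above are consistent with the input being pretty-printed, so that each object spans several lines while the tool reads one line at a time. A sketch of a preprocessing step, assuming the file is a sequence of valid JSON values: json.Decoder reads values regardless of line breaks, and json.Compact rewrites each one on a single line. compactStream is a hypothetical helper.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"strings"
)

// compactStream (hypothetical helper) reads consecutive JSON values from
// input, however they are laid out across lines, and returns each value
// as one compact line.
func compactStream(input string) ([]string, error) {
	dec := json.NewDecoder(strings.NewReader(input))
	var lines []string
	for {
		var raw json.RawMessage
		err := dec.Decode(&raw)
		if err == io.EOF {
			return lines, nil // clean end of input
		}
		if err != nil {
			return lines, err
		}
		var buf bytes.Buffer
		if err := json.Compact(&buf, raw); err != nil {
			return lines, err
		}
		lines = append(lines, buf.String())
	}
}

func main() {
	input := "{\n\"time\": 1348310820,\n\"tx_index\": 9741630\n}\n"
	lines, err := compactStream(input)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	for _, line := range lines {
		fmt.Println(line)
	}
}
```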

parse fail on unicode?

[anna@ ~]$ zcat /stream_archive/decodes.2012-06-10_11.log.gz | json2csv -k g | uniq > ~/test.cav
ERR: unable to parse json ({"a": "Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1", "c": "ID", "nk": 0, "tz": "Asia/Jakarta", "gr": "08", "g": "J5BebP", "i": "...", "h": "J5BebO", "k": "4fd4855a-00384-06073-271cf10a", "l": "mekaputra", "hh": "bit.ly", "r": "Dy\u00d9\u000b\uff8c\u00e1\uffb1\u0005<\uff95\u001e\fP\u00cd\u00df\u000b\u00d0\u00cd\u00dc\u0002\uff90\u0002\u0004\f\u012eo\u000b\u0010\u00c9\u0015\u0011P\u00c9\u0015\u0011", "u": "http://livebeta.kaskus.us/post/000000000000000694762851#post000000000000000694762851", "t": 1339327834, "hc": 1337025892, "kw": "movenkaskusbdg", "cy": "Malang", "ll": [-7.9797000885009766, 112.63040161132812]}
)

Nested indices

Does your code handle nested indices in json files?
I have a file that has "user" as an index, which has a nested index called "lang". "user.lang" doesn't seem to work, so I was wondering if there's any way to save "lang" in the csv.
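
For illustration, a dotted key such as user.lang can be resolved by walking the decoded object one segment at a time. This is a sketch of the general technique, not json2csv's actual implementation; lookup is a hypothetical helper.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// lookup (hypothetical helper) follows a dotted path such as "user.lang"
// through nested JSON objects. It reports false when a segment is missing
// or when an intermediate value is not an object.
func lookup(row map[string]interface{}, path string) (interface{}, bool) {
	var cur interface{} = row
	for _, key := range strings.Split(path, ".") {
		obj, ok := cur.(map[string]interface{})
		if !ok {
			return nil, false
		}
		cur, ok = obj[key]
		if !ok {
			return nil, false
		}
	}
	return cur, true
}

func main() {
	var row map[string]interface{}
	line := `{"user": {"name": "jehiah", "lang": "en"}}`
	if err := json.Unmarshal([]byte(line), &row); err != nil {
		fmt.Println("error:", err)
		return
	}
	value, ok := lookup(row, "user.lang")
	fmt.Println(value, ok)
}
```

If a lookup like this fails on real data, it is worth checking whether "user" is an object on every line, since a null or string value at that position stops the walk.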

bash: json2csv: command not found...

I'm sorry for asking a basic question but I'm new to golang. Running go get github.com/jehiah/json2csv on my CentOS 7 finished without any error but json2csv resulted in the error, bash: json2csv: command not found.... Please suggest where I am going wrong.
P.S. I had to reset GOROOT to allow installation to end successfully as suggested on this page.

Values from json array to separate csv fields

I love this tool! I'm using it to convert json files from Twitter's streaming API to csv. I have very little programming skills, so this tool is really good for me.

However I'm stuck at this point where I would like to extract values from the coordinate pairs within tweets. I think the coordinates are stored as an array:

"coordinates": {"coordinates": [3.82955502, 51.48484907], "type": "Point"}

Would it be possible to store the X and Y-coordinate in a separate csv field?

I tried: "json2csv -k coordinates.coordinates[0],coordinates.coordinates[1] -i ...etc" but this doesn't work.

Thanks in advance!
Roeland
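
As a workaround outside json2csv, the pair can be pulled apart with a typed struct. The sketch below assumes each line carries a "coordinates" object shaped like the fragment above; the tweet type and coordinateRow helper are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strconv"
)

// tweet models only the part of the record needed here.
type tweet struct {
	Coordinates struct {
		Coordinates []float64 `json:"coordinates"`
		Type        string    `json:"type"`
	} `json:"coordinates"`
}

// coordinateRow (hypothetical helper) returns the X and Y values as two
// separate CSV fields. Records without a coordinate pair yield empty fields.
func coordinateRow(line string) ([]string, error) {
	var t tweet
	if err := json.Unmarshal([]byte(line), &t); err != nil {
		return nil, err
	}
	c := t.Coordinates.Coordinates
	if len(c) < 2 {
		return []string{"", ""}, nil
	}
	return []string{
		strconv.FormatFloat(c[0], 'f', -1, 64),
		strconv.FormatFloat(c[1], 'f', -1, 64),
	}, nil
}

func main() {
	line := `{"coordinates": {"coordinates": [3.82955502, 51.48484907], "type": "Point"}}`
	fields, err := coordinateRow(line)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(fields[0], fields[1])
}
```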

Matrix: how to convert?

[
[1508834700000,"5688.5","5735.4","5676.51","5699.4","9.9216"],[1508835600000,"5699.4","5727","5699.4","5727","5.0476"],[1508836500000,"5727","5733.6","5644.31","5693.94","66.833"],
]

like this....
how to convert?
thx...
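
A sketch of converting an array of arrays directly, one inner array per CSV row. Note that the sample above ends with a trailing comma before the closing bracket, which is not valid JSON and would have to be removed first. matrixToCSV is a hypothetical helper, not a json2csv feature.

```go
package main

import (
	"encoding/csv"
	"encoding/json"
	"fmt"
	"strings"
)

// matrixToCSV (hypothetical helper) decodes a JSON array of arrays and
// writes each inner array as one CSV row.
func matrixToCSV(doc string) (string, error) {
	dec := json.NewDecoder(strings.NewReader(doc))
	dec.UseNumber() // keep large integers such as millisecond timestamps intact
	var rows [][]interface{}
	if err := dec.Decode(&rows); err != nil {
		return "", err
	}
	var sb strings.Builder
	w := csv.NewWriter(&sb)
	for _, row := range rows {
		record := make([]string, len(row))
		for i, cell := range row {
			record[i] = fmt.Sprint(cell)
		}
		if err := w.Write(record); err != nil {
			return "", err
		}
	}
	w.Flush()
	return sb.String(), w.Error()
}

func main() {
	doc := `[[1508834700000,"5688.5","5735.4"],[1508835600000,"5699.4","5727"]]`
	out, err := matrixToCSV(doc)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Print(out)
}
```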
