evan / ccsv
A pure-C CSV parser for Ruby
Home Page: http://blog.evanweaver.com/files/doc/fauna/ccsv/
License: Academic Free License v3.0
When attempting to parse a file with the following bytes, a double free occurs.
BD 22 5C 0A 0A
Tested on Ubuntu with Ruby 2.4.2.
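For anyone trying to reproduce this, the five bytes above can be written to a file as in the sketch below. The file name crash.csv is an assumption; the report does not name the input file or show the contents of test.rb.

```ruby
# Write the five bytes from the report (BD 22 5C 0A 0A) to a file.
# The file name is hypothetical; the report does not name it.
crash_bytes = [0xBD, 0x22, 0x5C, 0x0A, 0x0A].pack('C*')
File.binwrite('crash.csv', crash_bytes)

# Sanity check: print the bytes back in hex.
puts File.binread('crash.csv').bytes.map { |b| format('%02X', b) }.join(' ')
# prints: BD 22 5C 0A 0A
```

The bytes are 0xBD (not valid UTF-8 on its own), a double quote, a backslash, and two newlines.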
gdb debug:
gdb --batch -q --ex=r --ex 'back' --ex 'disass $pc, $pc+16' --ex 'info reg' --ex 'quit' --args /usr/local/bin/ruby /data/ccsv/ext/test.rb file_containing_crash_bytes 0</dev/null
gdb output:
*** Error in `/usr/local/bin/ruby': double free or corruption (fasttop): 0x0000000002116b80 ***
======= Backtrace: =========
/lib/x86_64-linux-gnu/libc.so.6(+0x777e5)[0x7fb6195c57e5]
/lib/x86_64-linux-gnu/libc.so.6(+0x8037a)[0x7fb6195ce37a]
/lib/x86_64-linux-gnu/libc.so.6(cfree+0x4c)[0x7fb6195d253c]
/data/ccsv/ext/ccsv.so(+0x2f0d)[0x7fb618d33f0d]
/usr/local/lib/libruby.so.2.4(+0x596a98)[0x7fb61a810a98]
/usr/local/lib/libruby.so.2.4(+0x58869b)[0x7fb61a80269b]
/usr/local/lib/libruby.so.2.4(+0x583741)[0x7fb61a7fd741]
/usr/local/lib/libruby.so.2.4(+0x583510)[0x7fb61a7fd510]
/usr/local/lib/libruby.so.2.4(+0x58319b)[0x7fb61a7fd19b]
/usr/local/lib/libruby.so.2.4(+0x5411bf)[0x7fb61a7bb1bf]
/usr/local/lib/libruby.so.2.4(+0x578756)[0x7fb61a7f2756]
/usr/local/lib/libruby.so.2.4(rb_iseq_eval_main+0x838)[0x7fb61a7f5218]
/usr/local/lib/libruby.so.2.4(ruby_run_node+0x339)[0x7fb61a3ff479]
/usr/local/bin/ruby[0x4011d7]
/lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xf0)[0x7fb61956e830]
/usr/local/bin/ruby(_start+0x29)[0x401059]
======= Memory map: ========
00400000-00402000 r-xp 00000000 00:32 300 /usr/local/bin/ruby
00601000-00602000 r--p 00001000 00:32 300 /usr/local/bin/ruby
00602000-00603000 rw-p 00002000 00:32 300 /usr/local/bin/ruby
00603000-00613000 rw-p 00000000 00:00 0
01dae000-02123000 rw-p 00000000 00:00 0 [heap]
7fb614000000-7fb614021000 rw-p 00000000 00:00 0
7fb614021000-7fb618000000 ---p 00000000 00:00 0
7fb618b1b000-7fb618b31000 r-xp 00000000 00:32 1495 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fb618b31000-7fb618d30000 ---p 00016000 00:32 1495 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fb618d30000-7fb618d31000 rw-p 00015000 00:32 1495 /lib/x86_64-linux-gnu/libgcc_s.so.1
7fb618d31000-7fb618d35000 r-xp 00000000 00:32 1326 /data/ccsv/ext/ccsv.so
7fb618d35000-7fb618f34000 ---p 00004000 00:32 1326 /data/ccsv/ext/ccsv.so
7fb618f34000-7fb618f35000 r--p 00003000 00:32 1326 /data/ccsv/ext/ccsv.so
7fb618f35000-7fb618f36000 rw-p 00004000 00:32 1326 /data/ccsv/ext/ccsv.so
7fb618f36000-7fb618f46000 r-xp 00000000 00:32 335 /usr/local/lib/ruby/2.4.0/x86_64-linux/stringio.so
7fb618f46000-7fb619146000 ---p 00010000 00:32 335 /usr/local/lib/ruby/2.4.0/x86_64-linux/stringio.so
7fb619146000-7fb619147000 r--p 00010000 00:32 335 /usr/local/lib/ruby/2.4.0/x86_64-linux/stringio.so
7fb619147000-7fb619148000 rw-p 00011000 00:32 335 /usr/local/lib/ruby/2.4.0/x86_64-linux/stringio.so
7fb619148000-7fb61914a000 r-xp 00000000 00:32 318 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/trans/transdb.so
7fb61914a000-7fb619349000 ---p 00002000 00:32 318 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/trans/transdb.so
7fb619349000-7fb61934a000 r--p 00001000 00:32 318 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/trans/transdb.so
7fb61934a000-7fb61934b000 rw-p 00002000 00:32 318 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/trans/transdb.so
7fb61934b000-7fb61934d000 r-xp 00000000 00:32 316 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/encdb.so
7fb61934d000-7fb61954c000 ---p 00002000 00:32 316 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/encdb.so
7fb61954c000-7fb61954d000 r--p 00001000 00:32 316 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/encdb.so
7fb61954d000-7fb61954e000 rw-p 00002000 00:32 316 /usr/local/lib/ruby/2.4.0/x86_64-linux/enc/encdb.so
7fb61954e000-7fb61970e000 r-xp 00000000 00:32 43 /lib/x86_64-linux-gnu/libc-2.23.so
7fb61970e000-7fb61990e000 ---p 001c0000 00:32 43 /lib/x86_64-linux-gnu/libc-2.23.so
7fb61990e000-7fb619912000 r--p 001c0000 00:32 43 /lib/x86_64-linux-gnu/libc-2.23.so
7fb619912000-7fb619914000 rw-p 001c4000 00:32 43 /lib/x86_64-linux-gnu/libc-2.23.so
7fb619914000-7fb619918000 rw-p 00000000 00:00 0
7fb619918000-7fb619a20000 r-xp 00000000 00:32 139 /lib/x86_64-linux-gnu/libm-2.23.so
7fb619a20000-7fb619c1f000 ---p 00108000 00:32 139 /lib/x86_64-linux-gnu/libm-2.23.so
7fb619c1f000-7fb619c20000 r--p 00107000 00:32 139 /lib/x86_64-linux-gnu/libm-2.23.so
7fb619c20000-7fb619c21000 rw-p 00108000 00:32 139 /lib/x86_64-linux-gnu/libm-2.23.so
7fb619c21000-7fb619c2a000 r-xp 00000000 00:32 305 /lib/x86_64-linux-gnu/libcrypt-2.23.so
7fb619c2a000-7fb619e29000 ---p 00009000 00:32 305 /lib/x86_64-linux-gnu/libcrypt-2.23.so
7fb619e29000-7fb619e2a000 r--p 00008000 00:32 305 /lib/x86_64-linux-gnu/libcrypt-2.23.so
7fb619e2a000-7fb619e2b000 rw-p 00009000 00:32 305 /lib/x86_64-linux-gnu/libcrypt-2.23.so
7fb619e2b000-7fb619e59000 rw-p 00000000 00:00 0
7fb619e59000-7fb619e5c000 r-xp 00000000 00:32 41 /lib/x86_64-linux-gnu/libdl-2.23.so
7fb619e5c000-7fb61a05b000 ---p 00003000 00:32 41 /lib/x86_64-linux-gnu/libdl-2.23.so
7fb61a05b000-7fb61a05c000 r--p 00002000 00:32 41 /lib/x86_64-linux-gnu/libdl-2.23.so
7fb61a05c000-7fb61a05d000 rw-p 00003000 00:32 41 /lib/x86_64-linux-gnu/libdl-2.23.so
7fb61a05d000-7fb61a075000 r-xp 00000000 00:32 85 /lib/x86_64-linux-gnu/libpthread-2.23.so
7fb61a075000-7fb61a274000 ---p 00018000 00:32 85 /lib/x86_64-linux-gnu/libpthread-2.23.so
7fb61a274000-7fb61a275000 r--p 00017000 00:32 85 /lib/x86_64-linux-gnu/libpthread-2.23.so
7fb61a275000-7fb61a276000 rw-p 00018000 00:32 85 /lib/x86_64-linux-gnu/libpthread-2.23.so
7fb61a276000-7fb61a27a000 rw-p 00000000 00:00 0
7fb61a27a000-7fb61a905000 r-xp 00000000 00:32 303 /usr/local/lib/libruby.so.2.4.2
7fb61a905000-7fb61ab05000 ---p 0068b000 00:32 303 /usr/local/lib/libruby.so.2.4.2
7fb61ab05000-7fb61ab0b000 r--p 0068b000 00:32 303 /usr/local/lib/libruby.so.2.4.2
7fb61ab0b000-7fb61ab0e000 rw-p 00691000 00:32 303 /usr/local/lib/libruby.so.2.4.2
7fb61ab0e000-7fb61ab1e000 rw-p 00000000 00:00 0
7fb61ab1e000-7fb61ab44000 r-xp 00000000 00:32 36 /lib/x86_64-linux-gnu/ld-2.23.so
7fb61ac04000-7fb61ad3b000 rw-p 00000000 00:00 0
7fb61ad3c000-7fb61ad3d000 rw-p 00000000 00:00 0
7fb61ad3d000-7fb61ad3e000 ---p 00000000 00:00 0
7fb61ad3e000-7fb61ad43000 rw-p 00000000 00:00 0
7fb61ad43000-7fb61ad44000 r--p 00025000 00:32 36 /lib/x86_64-linux-gnu/ld-2.23.so
7fb61ad44000-7fb61ad45000 rw-p 00026000 00:32 36 /lib/x86_64-linux-gnu/ld-2.23.so
7fb61ad45000-7fb61ad46000 rw-p 00000000 00:00 0
7ffc1a668000-7ffc1ae67000 rw-p 00000000 00:00 0 [stack]
7ffc1ae6a000-7ffc1ae6c000 r--p 00000000 00:00 0 [vvar]
7ffc1ae6c000-7ffc1ae6e000 r-xp 00000000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
During startup program terminated with signal SIGABRT, Aborted.
"bar\"baz" as "\"bar\"\"baz\""
"bar,baz" as "\"bar" and "baz\""
"\"\"" instead of empty string ("")
"foo"x"bar"
"Foo","Bar","Baz
"Foo","Bar", "Baz"
Please add.
multiline.csv
foo,"bar
baz",bzz
Expected output
CSV.foreach('multiline.csv').to_a # [["foo", "bar\nbaz", "bzz"]]
Actual output
rows = []
Ccsv.foreach(filename) {|row| rows << row}
rows # [["foo", "\"bar"], ["baz\"", "bzz"]]
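The expected behavior can be checked against Ruby's standard csv library without creating a file. This sketch inlines the contents of multiline.csv as a string; it only shows what the standard library returns and does not call Ccsv.

```ruby
require 'csv'

# Same content as multiline.csv, inlined so the sketch is self-contained.
data = "foo,\"bar\nbaz\",bzz\n"

# The quoted field spans two physical lines but is one logical field.
rows = CSV.parse(data)
p rows # => [["foo", "bar\nbaz", "bzz"]]
```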
CCSV:
[["foo\r"], ["bar\r"], ["baz\r"]]
CSV and all other CSV libraries tested:
[["foo"], ["bar"], ["baz"]]
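The trailing "\r" suggests the input used CRLF line endings, although the report does not show the file, so that is an assumption. Until the parser handles this, a caller-side workaround might look like the sketch below. The helper name strip_trailing_cr is hypothetical and not part of Ccsv.

```ruby
# Hypothetical helper: remove one trailing carriage return from the last
# field of a parsed row. The input row is not modified.
def strip_trailing_cr(row)
  return row if row.empty?
  last = row.last
  return row unless last.is_a?(String)
  row[0...-1] + [last.chomp("\r")]
end

# Rows as reported for CCSV above.
rows = [["foo\r"], ["bar\r"], ["baz\r"]]
p rows.map { |row| strip_trailing_cr(row) } # => [["foo"], ["bar"], ["baz"]]
```

Only the last field is touched, since a carriage return left over from a CRLF line ending can only appear at the end of a row.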
No CSV specification escapes fields the way this test does: https://github.com/evan/ccsv/blob/master/spec/ccsv_spec.rb#L111
Similarly, column separators are escaped by double-quoting the field, not with backslashes: https://github.com/evan/ccsv/blob/master/spec/ccsv_spec.rb#L76
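For comparison, this is how Ruby's standard csv library writes and reads fields containing a quote or a separator. It is a sketch of the conventional quoting style, not of Ccsv's behavior.

```ruby
require 'csv'

# A field with an embedded quote, a field with an embedded comma,
# and a field that needs no quoting.
fields = ['bar"baz', 'bar,baz', 'plain']

line = CSV.generate_line(fields)
puts line # prints: "bar""baz","bar,baz",plain

# Parsing the generated line gives back the original fields.
parsed = CSV.parse_line(line)
p parsed == fields # => true
```

An embedded quote is written as two quotes inside a quoted field, and a field containing the separator is simply wrapped in quotes. No backslashes are involved.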
Try with this file:
foo
Can't install gem from GitHub without a gemspec.
Currently, if the file is not UTF-8 encoded, the rows are not written to the file. Can this be handled, for example like CSV.foreach(csvfilename, encoding: "ISO-8859-1")?
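One possible caller-side approach is to transcode the file contents to UTF-8 before parsing. The sketch below assumes the source encoding is known in advance. The helper name read_as_utf8 and the sample data are hypothetical, and whether Ccsv then handles the transcoded data correctly has not been verified here.

```ruby
require 'tempfile'

# Hypothetical helper: read a file in a known encoding and return its
# contents transcoded to UTF-8. The encoding is not auto-detected.
def read_as_utf8(path, source_encoding)
  File.read(path, encoding: source_encoding).encode('UTF-8')
end

# Demo with a small ISO-8859-1 file (0xE9 is "e" with an acute accent).
Tempfile.create('latin1') do |f|
  f.binmode
  f.write("caf\xE9,bar\n".b)
  f.flush
  text = read_as_utf8(f.path, 'ISO-8859-1')
  puts text.encoding    # prints: UTF-8
  puts text.valid_encoding?
end
```

The transcoded string could then be written to a temporary UTF-8 file and passed to the parser.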