edn_turbo's Introduction

edn_turbo 0.8.0

Fast Ragel-based EDN parser for Ruby.

edn_turbo can be used as a parser plugin for edn-ruby. With a few exceptions, edn_turbo provides the same functionality as the edn gem, but since the edn_turbo parser is implemented in C++, it is an order of magnitude faster.

Some quick sample runs comparing the timing of reads using edn and edn_turbo (see issue 12):

irb(main):001:0> require 'benchmark'
=> true
irb(main):002:0> require 'edn'
=> true
irb(main):003:0> s = "[{\"x\" {\"id\" \"/model/952\", \"model_name\" \"person\", \"ancestors\" [\"record\" \"asset\"], \"format\" \"edn\"}, \"id\" 952, \"name\" nil, \"model_name\" \"person\", \"rel\" {}, \"description\" nil, \"age\" nil, \"updated_at\" nil, \"created_at\" nil, \"anniversary\" nil, \"job\" nil, \"start_date\" nil, \"username\" nil, \"vacation_start\" nil, \"vacation_end\" nil, \"expenses\" nil, \"rate\" nil, \"display_name\" nil, \"gross_profit_per_month\" nil}]"
=> "[{\"x\" {\"id\" \"/model/952\", \"model_name\" \"person\", \"ancestors\" [\"record\" \"asset\"], \"format\" \"edn\"}, \"id\" 952, \"name\" nil, \"model_name\" \"person\", \"rel\" {}, \"description\" nil, \"age\" nil, \"updated_at\" nil, \"created_at\" nil, \"anniversary\" nil, \"job\" nil, \"start_date\" nil, \"username\" nil, \"vacation_start\" nil, \"vacation_end\" nil, \"expenses\" nil, \"rate\" nil, \"display_name\" nil, \"gross_profit_per_month\" nil}]"
irb(main):004:0> Benchmark.realtime { 100.times { EDN::read(s) } }
=> 0.083543
irb(main):005:0> Benchmark.realtime { 100000.times { EDN::read(s) } }
=> 73.901049
irb(main):006:0> require 'edn_turbo'
=> true
irb(main):007:0> Benchmark.realtime { 100.times { EDN::read(s) } }
=> 0.007321
irb(main):008:0> Benchmark.realtime { 100000.times { EDN::read(s) } }
=> 2.866411
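
The irb session above can be turned into a small repeatable script. The following is a minimal sketch, not part of the gem: `time_reads` and `speedup` are hypothetical helper names, and the stand-in workload only exists so the script runs without any EDN gem installed. Replace the block body with `EDN.read(s)` to reproduce the comparison, once with only `edn` loaded and once after `require 'edn_turbo'`.

```ruby
require 'benchmark'

# Time `iterations` calls of the given block and report totals.
# The block is expected to wrap a parser call such as `EDN.read(s)`.
def time_reads(iterations)
  total = Benchmark.realtime { iterations.times { yield } }
  { iterations: iterations, total_s: total, per_call_us: total / iterations * 1_000_000 }
end

# Ratio between two timings taken with the same iteration count.
def speedup(baseline_s, candidate_s)
  baseline_s / candidate_s
end

# Stand-in workload (an assumption, not an EDN parse).
sample = '[1 2 3]'
result = time_reads(1_000) { sample.split }
puts format('%<iterations>d reads: %<total_s>.6fs total, %<per_call_us>.2fus per call', result)

# Speedups implied by the timings reported in the session above.
puts format('100 reads:    %.1fx', speedup(0.083543, 0.007321))
puts format('100000 reads: %.1fx', speedup(73.901049, 2.866411))
```

With the timings reported above, this works out to roughly 11x for 100 reads and roughly 26x for 100000 reads.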

Dependencies

Ruby 2.6 or greater.

Notes:

  • edn_turbo uses a ragel-based parser but the generated .cc file is bundled so ragel should not need to be installed.
  • If your system updates the installed version of icu4c, you'll likely get symbol errors when trying to use edn_turbo as the libraries it was linked against when first installed will no longer exist. To resolve this, reinstall the gem so it is built against the new icu4c libraries.

Usage

Simply require 'edn_turbo' instead of 'edn'. Otherwise (with the exceptions noted below) the API is the same as the edn gem.

    require 'edn_turbo'

    File.open(filename) do |file|
       output = EDN.read(file)
       pp output if output != EOF
    end

    # also accepts a string
    pp EDN.read("[ 1 2 3 abc ]")

    # metadata
    e = EDN.read('^String ^:foo ^{:foo false :tag Boolean :bar 2} [1 2]')
    pp e          # -> [1, 2]
    pp e.metadata # -> {:foo=>true, :tag=>#<EDN::Type::Symbol:0x007fdbea8a29b0 @symbol=:String>, :bar=>2}

Or instantiate and reuse an instance of a parser:

    require 'edn_turbo'

    p = EDN::new_parser
    File.open(filename) do |file|
       output = p.parse(file)
       pp output if output != EOF
    end

    # with a string
    pp p.parse("[ 1 2 3 abc ]")


    # set new input
    s = "(1) :abc { 1 2 }"
    p.set_input(s)

    # parse token by token
    loop do
      t = p.read
      break if t == EOF

      pp t
    end

Differences with edn gem

  • edn_turbo reads String and core IO types using C-api calls. However, data from StringIO sources is extracted using read() calls into the ruby side.

  • As of v0.6.1, edn_turbo supports EDN ratio literals, returning a ruby Rational representation for them. See edn-format/edn#64.

  • As of v0.6.2, edn_turbo supports representation of ##Inf as Float::INFINITY and ##NaN as Float::NAN.

  • As of v0.7.1, edn_turbo requires ruby 2.5 or greater.

  • As of v0.7.4, edn_turbo requires ruby 2.6 or greater.

  • As of v0.8.0, edn_turbo replaces its edn-ruby dependency with edn2023.
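
For readers unfamiliar with the Ruby types named in the list above, here is a short sketch of how they behave. It uses only core Ruby and does not call edn_turbo. `edn_equal?` is a hypothetical helper, shown because `Float::NAN` never compares equal to itself, which matters when comparing parsed results in tests.

```ruby
# Ruby values the list above says are returned for these EDN literals.
ratio = Rational(1, 3)   # an EDN ratio literal such as 1/3
inf   = Float::INFINITY  # ##Inf
nan   = Float::NAN       # ##NaN

# Rational arithmetic stays exact.
puts (ratio + ratio).inspect

# Infinity and NaN are ordinary Floats with dedicated predicates.
puts inf.infinite?.inspect
puts nan.nan?

# NaN is not equal to itself, so a plain == comparison of parsed data
# containing ##NaN fails. This helper treats two NaNs as equal.
def edn_equal?(a, b)
  return true if a.is_a?(Float) && b.is_a?(Float) && a.nan? && b.nan?

  a == b
end

puts edn_equal?(nan, Float::NAN)
```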

Building and running tests

bundle install
bundle exec rake
bundle exec rspec

edn_turbo's People

Contributors

andrerocker, caleb, edporras, jdliss, russolsen

edn_turbo's Issues

Homebrew now installs in /opt/homebrew, so the native config no longer works

I just got a new Mac and was surprised to find that the default location for Homebrew is now /opt/homebrew

This makes sense since there were conflicts with the original /usr/local location, but it prevented edn_turbo from compiling.

The location of pkg-config is hardcoded to /usr/local/bin/pkg-config.

I did a quick patch on a fork that uses the mkmf pkg_config() function and sets the PKG_CONFIG_PATH to look for icu4c in /usr/local and /opt/homebrew locations:

https://github.com/caleb/edn_turbo/blob/55ab642bd10f71664f7be4370b8e66052390a7ba/ext/edn_turbo/extconf.rb#L27-L34

I don't know if you need to run pkg-config manually like you are doing currently, so I don't want to make a PR quite yet, but the above configuration seems to work.
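
The approach described above, extending PKG_CONFIG_PATH to cover both Homebrew prefixes, might look roughly like the following. This is a sketch and not the code from the linked fork: `ICU_PKGCONFIG_DIRS` and `pkg_config_path_with` are hypothetical names, and the directories assume Homebrew's default prefixes with icu4c's pkgconfig files under `opt/icu4c/lib/pkgconfig`.

```ruby
# Candidate pkg-config directories for Homebrew's icu4c (assumed defaults).
ICU_PKGCONFIG_DIRS = [
  '/opt/homebrew/opt/icu4c/lib/pkgconfig', # Apple Silicon prefix
  '/usr/local/opt/icu4c/lib/pkgconfig'     # Intel prefix
].freeze

# Build a PKG_CONFIG_PATH value that keeps any existing entries and
# appends only those candidate directories that exist on this machine.
def pkg_config_path_with(candidates, current = ENV['PKG_CONFIG_PATH'])
  existing = candidates.select { |dir| Dir.exist?(dir) }
  current_dirs = current.to_s.split(File::PATH_SEPARATOR)
  (current_dirs + existing).reject(&:empty?).uniq.join(File::PATH_SEPARATOR)
end

ENV['PKG_CONFIG_PATH'] = pkg_config_path_with(ICU_PKGCONFIG_DIRS)
puts ENV['PKG_CONFIG_PATH']

# In an extconf.rb, after `require 'mkmf'`, a call such as
# pkg_config('icu-uc') would then consult the extended search path.
```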

Terminal corruption with pry

On macOS, pry 0.10.3, ruby 2.3.3, iTerm2

pry -r edn_turbo

This causes C-c to exit pry directly, and iTerm no longer breaks lines when I hit return. When I press arrow-up, no commands are shown, but they are executed if I press return.

On the other hand, irb -r edn_turbo works as expected.

FYI: My dirty fix was to have a Signal.trap('SIGINT') { puts '' } when launching the pry console.

edn_turbo doesn't work on truffleruby

I get an argument count error when trying to use edn_turbo on TruffleRuby when I call EDNT::Parser#read. I have a patch that seems to fix it: #17 but I'm not sure if the extra argument was there for a reason...

Edit: There are other issues on TruffleRuby, but I thought this fix might be applicable to MRI as well.

TypeError exception: can't define singleton

I just tried to parse some EDN (routes for bidi) with some metadata, and got this:

[margo] pry
[1] pry(main)> require 'edn_turbo'
=> true
[2] pry(main)> EDN.read('["/" {"index.html" :index "articles/" {"index.html" ^:list :article-index "article.html" :article}}]')
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: TypeError exception: can't define singleton
[1]    99990 abort      pry
[margo] pry
[1] pry(main)> require 'edn_turbo'
=> true
[2] pry(main)> EDN.read('["/" {"index.html" :index "articles/" {"index.html" ^{:list true} :article-index "article.html" :article}}]')
libc++abi.dylib: terminating with uncaught exception of type std::runtime_error: TypeError exception: can't define singleton
[1]    189 abort      pry
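
One plausible cause, stated here as an assumption rather than something the report confirms: if metadata is attached by extending the parsed value with a module, that cannot work for keywords, because they become Ruby Symbols and Symbols cannot have singleton classes. The sketch below reproduces that failure mode in plain Ruby. `Metadata` and `attach_metadata` are hypothetical names, not the gem's API.

```ruby
# Hypothetical stand-in for a module that carries metadata on a value.
module Metadata
  attr_accessor :metadata
end

# Try to attach metadata by extending the value. Values that cannot be
# extended (for example Symbols) are returned unchanged with a warning.
def attach_metadata(value, meta)
  value.extend(Metadata)
  value.metadata = meta
  value
rescue TypeError, FrozenError => e
  warn "cannot attach metadata to #{value.inspect}: #{e.class}: #{e.message}"
  value
end

vector  = attach_metadata([1, 2], { list: true })
keyword = attach_metadata(:'article-index', { list: true })

puts vector.metadata.inspect
puts keyword.respond_to?(:metadata)
```

Rescuing on the Ruby side like this drops the metadata silently apart from the warning. Whether that or raising a regular Ruby exception is preferable is a design decision for the parser.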

Blank strings are returned encoded in ASCII-8BIT

Blank strings are returned encoded in ASCII-8BIT, unlike non-blank strings, which are returned as UTF-8. This results in an exception when calling string.unicode_normalize. This is different behavior than the edn gem.

Using edn_turbo:

irb(main):001:0> RUBY_VERSION
=> "3.1.2"
irb(main):002:0> require 'edn_turbo'
=> true
irb(main):003:0> my_string = EDN.read( '"my_string"' )
=> "my_string"
irb(main):004:0> blank_string = EDN.read( '""' )
=> ""
irb(main):005:0> my_string.encoding
=> #<Encoding:UTF-8>
irb(main):006:0> blank_string.encoding
=> #<Encoding:ASCII-8BIT>
irb(main):007:0> my_string.unicode_normalize
=> "my_string"
irb(main):008:0> blank_string.unicode_normalize
/usr/local/Cellar/ruby/3.1.2/lib/ruby/3.1.0/unicode_normalize/normalize.rb:141:in `normalize': Unicode Normalization not appropriate for ASCII-8BIT (Encoding::CompatibilityError)
  from (irb):8:in `unicode_normalize'
  from (irb):8:in `<main>'
  from /usr/local/Cellar/ruby/3.1.2/lib/ruby/gems/3.1.0/gems/irb-1.4.1/exe/irb:11:in `<top (required)>'
  from /usr/local/opt/ruby/bin/irb:25:in `load'
  from /usr/local/opt/ruby/bin/irb:25:in `<main>'

Using edn:

irb(main):001:0> RUBY_VERSION
=> "3.1.2"
irb(main):002:0> require 'edn'
=> true
irb(main):003:0> my_string = EDN.read( '"my_string"' )
=> "my_string"
irb(main):004:0> blank_string = EDN.read( '""' )
=> ""
irb(main):005:0> my_string.encoding
=> #<Encoding:UTF-8>
irb(main):006:0> blank_string.encoding
=> #<Encoding:UTF-8>
irb(main):007:0> my_string.unicode_normalize
=> "my_string"
irb(main):008:0> blank_string.unicode_normalize
=> ""
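
Until the parser returns UTF-8 for blank strings, a caller could patch up parsed data after the fact. The following is a minimal sketch of such a workaround. `fix_blank_encoding!` is a hypothetical helper, and it only walks plain Arrays and Hash values, which is an assumption about the shape of the parsed data.

```ruby
# Re-tag empty ASCII-8BIT strings as UTF-8, in place, walking nested
# Arrays and Hash values. Frozen strings and hash keys are left as-is.
def fix_blank_encoding!(value)
  case value
  when String
    if value.empty? && value.encoding == Encoding::ASCII_8BIT && !value.frozen?
      value.force_encoding(Encoding::UTF_8)
    end
  when Array
    value.each { |v| fix_blank_encoding!(v) }
  when Hash
    value.each_value { |v| fix_blank_encoding!(v) }
  end
  value
end

# `''.b` builds an empty ASCII-8BIT string, standing in for parser output.
data = ['my_string', ''.b, { 'name' => ''.b }]
fix_blank_encoding!(data)

puts data[1].encoding
puts data[2]['name'].unicode_normalize.inspect
```

Mutating in place keeps the original containers intact, so anything else attached to them by the parser is not lost.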

edn_turbo can get linked against wrong version of icu4c on OSX

These are the steps to reproduce it (on both Mojave and High Sierra):

• Install v8 via Homebrew
• Install icu4c via Homebrew
• Install the v8 gem (gem install therubyracer)
• Install your gem. In the compilation phase it’s linked against the wrong icu4c.

I hacked the Makefile to print out the arguments being supplied to the compiler and they are:

-L. -L/Users/brendan/.rbenv/versions/2.4.2/lib -L/usr/local/lib -L/usr/local/opt/icu4c/lib -L. -L/Users/brendan/.rbenv/versions/2.4.2/lib -fstack-protector -L/usr/local/lib -L/Users/brendan/.rbenv/versions/2.4.2/lib -Wl,-undefined,dynamic_lookup -Wl,-multiply_defined,suppress -licuuc -lpthread -lgmp -ldl -lobjc

Notice /usr/local/lib is first. Searching this path for installed versions of icu4c, the linker finds the icuuc installed by v8 first:

/usr/local/lib/libicuuc.dylib@
➜  2.4.2 git:(master) ✗ otool -L /usr/local/lib/libicuuc.dylib
/usr/local/lib/libicuuc.dylib:
        /usr/local/opt/v8/lib/libicuuc.dylib (compatibility version 0.0.0, current version 0.0.0)
        /usr/lib/libc++.1.dylib (compatibility version 1.0.0, current version 400.9.4)
        /usr/lib/libSystem.B.dylib (compatibility version 0.0.0, current version 1252.200.5)

I think this is caused by the order in which we search for the linked libraries. If you swap the following found here: https://github.com/edporras/edn_turbo/blob/master/ext/edn_turbo/extconf.rb#L12-L13

lib_dirs = [
  '/usr/local/lib', # must be the first entry; add others after it
  '/usr/local/opt/icu4c/lib'
].freeze

to

lib_dirs = [
  '/usr/local/opt/icu4c/lib',
  '/usr/local/lib', # must be the first entry; add others after it
].freeze

It links correctly. For me at least.
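
The effect of the ordering can be checked without rebuilding the gem. The sketch below mimics the linker's first-match behavior over a list of library directories. `first_lib_dir` is a hypothetical helper, and the library file name assumes macOS.

```ruby
# Return the first directory in `lib_dirs` that contains `lib_file`,
# mimicking how -L search paths are consulted in order for a -l flag.
def first_lib_dir(lib_dirs, lib_file)
  lib_dirs.find { |dir| File.exist?(File.join(dir, lib_file)) }
end

original = ['/usr/local/lib', '/usr/local/opt/icu4c/lib']
swapped  = ['/usr/local/opt/icu4c/lib', '/usr/local/lib']

# On a machine with both copies installed, the two orderings resolve to
# different directories; on other machines this may print nil.
puts first_lib_dir(original, 'libicuuc.dylib').inspect
puts first_lib_dir(swapped, 'libicuuc.dylib').inspect
```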

Error: uninitialized constant Bignum (NameError)

Related to: relevance/edn-ruby#39

The edn gem dependency relies on the Ruby Bignum class. This has been deprecated for some time and has now been removed from Ruby 3.2.

Loading edn_turbo now results in an error. Is there any possible workaround?

% ruby --version
ruby 3.2.1 (2023-02-08 revision 31819e82c8) [arm64-darwin22]

irb(main):001:0> require 'edn_turbo'
/opt/homebrew/lib/ruby/gems/3.2.0/gems/edn-1.1.1/lib/edn/core_ext.rb:97:in `<top (required)>': uninitialized constant Bignum (NameError)

Bignum.send(:include, EDN::CoreExt::Bignum)
^^^^^^
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require'
	from /opt/homebrew/lib/ruby/gems/3.2.0/gems/edn-1.1.1/lib/edn.rb:4:in `<top (required)>'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require'
	from /opt/homebrew/lib/ruby/gems/3.2.0/gems/edn_turbo-0.7.4/lib/edn_turbo.rb:25:in `<top (required)>'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:162:in `require'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:162:in `rescue in require'
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:152:in `require'
	from (irb):1:in `<main>'
	from /opt/homebrew/Cellar/ruby/3.2.1/lib/ruby/gems/3.2.0/gems/irb-1.6.2/exe/irb:11:in `<top (required)>'
	from /opt/homebrew/opt/ruby/bin/irb:25:in `load'
	from /opt/homebrew/opt/ruby/bin/irb:25:in `<main>'
<internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require': cannot load such file -- edn_turbo (LoadError)
	from <internal:/opt/homebrew/lib/ruby/site_ruby/3.2.0/rubygems/core_ext/kernel_require.rb>:88:in `require'
	from (irb):1:in `<main>'
	from /opt/homebrew/Cellar/ruby/3.2.1/lib/ruby/gems/3.2.0/gems/irb-1.6.2/exe/irb:11:in `<top (required)>'
	from /opt/homebrew/opt/ruby/bin/irb:25:in `load'
	from /opt/homebrew/opt/ruby/bin/irb:25:in `<main>'
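
For installations that cannot move to a release with the updated dependency, one possible stopgap is to restore the removed constants before loading the gem. This is a sketch of a common shim, not a fix endorsed in this thread, and it has not been verified against edn 1.1.1. It relies on Integer having absorbed Fixnum and Bignum since Ruby 2.4.

```ruby
# Re-create the constants removed in Ruby 3.2 as aliases of Integer.
# On older Rubies the constants already exist, so nothing is redefined.
%i[Fixnum Bignum].each do |name|
  Object.const_set(name, Integer) unless Object.const_defined?(name)
end

# The shim must run before the gem is loaded, e.g.:
#   require 'edn_turbo'

puts Object.const_get(:Bignum).equal?(Integer)
```

Because this defines global constants, it may mask the same incompatibility in other gems, so it is best kept close to the require it is meant to unblock.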
