serde-rs / json-benchmark
nativejson-benchmark in Rust
License: Apache License 2.0
While trying to find a good JSON DOM library that supports borrowing, I came across this benchmark and almost used the json crate. I found it odd that the crates.io listing has no README, yet it's such a popular crate and performs quite well!
Then I saw on GitHub that it isn't maintained. That's not a problem in itself if development is simply finished, as a JSON library's certainly could be. But another issue points out a soundness problem that Miri also flags.
Given that it's unmaintained and potentially unsound, I'm not sure how valid a comparison it actually provides in today's JSON ecosystem. In fact, I worry that its mere presence on this list might be leading some developers to adopt the crate despite it not being, in my opinion, a safe crate to adopt given the open issues.
Should I update the results from my own machine if I make changes?
Results I'm getting now:
Running `target/release/json-benchmark`
DOM STRUCT
======= serde_json ======= parse|stringify === parse|stringify ===
data/canada.json 51.8ms 23.9ms 36.9ms 20.0ms
data/citm_catalog.json 30.2ms 3.3ms 20.6ms 1.7ms
data/twitter.json 10.3ms 1.4ms 8.4ms 1.4ms
======= json-rust ======== parse|stringify === parse|stringify ===
data/canada.json 42.0ms 42.3ms
data/citm_catalog.json 16.3ms 2.0ms
data/twitter.json 6.7ms 1.1ms
==== rustc_serialize ===== parse|stringify === parse|stringify ===
data/canada.json 39.7ms 68.8ms 48.0ms 65.0ms
data/citm_catalog.json 26.8ms 6.1ms 32.7ms 4.1ms
data/twitter.json 13.4ms 2.8ms 17.9ms 2.6ms
This is using a branch of json-rust that can write directly to `io::Write` types. The performance is roughly in line, although the difference is smaller (and serde_json writing to a struct is faster). The remaining gap is accounted for by the fact that `io::Write` doesn't provide a method for writing a single `u8`, so every single-token write has to be wrapped in a `&[u8]` slice, which adds an unnecessary level of indirection.
The project now lives at https://github.com/simd-lite/simd-json
I did open an issue about this on the original nativejson-benchmark; so far it's unanswered.
The conformance tests of the nativejson-benchmark aren't actually testing conformance with ECMA or the RFC, despite what the README says. Taking the roundtrip tests:

- `roundtrip10.json` will fail on any implementation that doesn't preserve the ordering of keys, which is not required by the standard (I pass it by virtue of the keys being in alphabetical order).
- `roundtrip13.json`, `roundtrip14.json`, `roundtrip18.json` and `roundtrip19.json` all require higher precision than IEEE 754 float64; this is not required by the standard.
- `roundtrip20.json` requires that a floating-point zero is distinct from an integer zero and is represented with a fraction; this is not required by the standard.
- `roundtrip21.json` again requires a distinct representation of floating zero, and requires that negative zero is preserved; neither is a requirement of the standard.
- `roundtrip24.json`, `roundtrip25.json`, `roundtrip26.json` and `roundtrip27.json` require that `e` notation is written with a lowercase `e` and that positive exponents are written without a plus. Any implementation that writes a capital `E` and/or adds a `+` to positive exponents would fail these.

Web browsers and Node.js would fail 7 of those tests while being perfectly within the standard.
I don't know if my original issue is going to be resolved, but should we add a note that this is a conformance test as ported from nativejson-benchmark, and is not actually testing conformance to the JSON standard as such?
The `rustc_serialize` library has a streaming / token-based parser, which can be used in advanced applications to implement very efficient JSON extraction while keeping the memory overhead of parsing low; this is great for large documents that may not easily fit into RAM. It would be awesome to see benchmarks for this style of API.
@sunnygleason and I have put quite some work into improving simd-json.rs performance; it would be great if the numbers could be updated with the current version of simd-json.rs.
Many thanks in advance!
Where possible.
Hi!
I am doing research on the effects of Profile-Guided Optimization (PGO) on different kinds of software; my current results are available here. I decided to PGO-optimize json-benchmark and try to estimate the PGO performance boost for this kind of workload.
My test environment is a MacBook M1 Pro, Ventura 13.4, with a charger connected (this matters, since it affects the results). The results were collected over multiple runs of `cargo run --release`, taking the best result each time (the same methodology described in the README file). PGO optimization was done with cargo-pgo. My results are the following.
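For reference, the cargo-pgo workflow is roughly the following (a sketch assuming cargo-pgo is installed via `cargo install cargo-pgo`; the exact binary path depends on your target triple):

```shell
cargo pgo build        # build an instrumented binary
# run the instrumented json-benchmark binary on the workload
# to collect .profraw profile data
cargo pgo optimize     # rebuild, optimizing with the collected profiles
```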
Release:
======= serde_json ======= parse|stringify ===== parse|stringify ====
data/canada.json 440 MB/s 710 MB/s 650 MB/s 510 MB/s
data/citm_catalog.json 830 MB/s 940 MB/s 1400 MB/s 1210 MB/s
data/twitter.json 510 MB/s 1360 MB/s 970 MB/s 1490 MB/s
==== rustc_serialize ===== parse|stringify ===== parse|stringify ====
data/canada.json 250 MB/s 140 MB/s 210 MB/s 100 MB/s
data/citm_catalog.json 350 MB/s 390 MB/s 270 MB/s 470 MB/s
data/twitter.json 200 MB/s 600 MB/s 150 MB/s 670 MB/s
======= simd-json ======== parse|stringify ===== parse|stringify ====
data/canada.json 710 MB/s 770 MB/s 890 MB/s
data/citm_catalog.json 1890 MB/s 1070 MB/s 2290 MB/s
data/twitter.json 1820 MB/s 1630 MB/s 1420 MB/s
PGO-optimized:
======= serde_json ======= parse|stringify ===== parse|stringify ====
data/canada.json 590 MB/s 720 MB/s 1150 MB/s 530 MB/s
data/citm_catalog.json 1100 MB/s 1090 MB/s 1620 MB/s 1210 MB/s
data/twitter.json 600 MB/s 1370 MB/s 990 MB/s 1370 MB/s
==== rustc_serialize ===== parse|stringify ===== parse|stringify ====
data/canada.json 280 MB/s 150 MB/s 230 MB/s 110 MB/s
data/citm_catalog.json 450 MB/s 470 MB/s 340 MB/s 530 MB/s
data/twitter.json 220 MB/s 740 MB/s 170 MB/s 780 MB/s
======= simd-json ======== parse|stringify ===== parse|stringify ====
data/canada.json 740 MB/s 780 MB/s 990 MB/s
data/citm_catalog.json 1900 MB/s 1190 MB/s 2520 MB/s
data/twitter.json 1900 MB/s 1700 MB/s 1620 MB/s
According to these tests, PGO yields noticeably better JSON parsing performance. I hope these results are useful to someone. It would probably be a good idea to add information about PGO to the libraries' documentation, somewhere in a "Performance tuning" section.
Pushed to `benches`; just need to `cargo update` and rebuild. My hardware:
======= serde_json ======= parse|stringify === parse|stringify ===
data/canada.json 24.7ms 23.2ms 12.5ms 19.1ms
data/citm_catalog.json 17.3ms 3.3ms 7.4ms 1.6ms
data/twitter.json 7.1ms 1.4ms 4.8ms 1.4ms
======= json-rust ======== parse|stringify === parse|stringify ===
data/canada.json 19.3ms 17.1ms
data/citm_catalog.json 10.7ms 1.4ms
data/twitter.json 4.1ms 1.0ms
==== rustc_serialize ===== parse|stringify === parse|stringify ===
data/canada.json 38.8ms 65.9ms 46.9ms 65.4ms
data/citm_catalog.json 26.2ms 6.0ms 34.4ms 4.4ms
data/twitter.json 13.3ms 2.7ms 18.5ms 2.5ms
So, I have the json.rs domain at my disposal. If this benchmark can be made to dump a JSON (duh) file with the results, which could then be committed to GitHub, I can put up a simple webpage at that address that grabs the JSON from the repo and draws graphs for it in JavaScript. That way we'd have something nice to present without much manual updating: just build -> run -> commit and push.
How does that sound?
Edit: Actually, given the formatting of the log, I might be able to visualize it by scraping the README.
Traverse the DOM and count the number of each JSON type, the total length of strings, and the total number of elements/members in arrays/objects.
Hi!
It seems the comparison isn't entirely fair, since it uses outdated C++ compilers against a quite recent Rust compiler. Please update to the latest GCC 10 / Clang 10.
Thank you.
Similar to memorystat.h.