yyjsonr's Introduction

yyjsonr

{yyjsonr} is a fast JSON parser/serializer, which converts R data to/from JSON.

In most cases it is around 2x to 10x faster than {jsonlite} at both reading and writing JSON.

It is a wrapper for the yyjson C library (v0.8.0). yyjson is MIT licensed; see LICENSE-yyjson.txt in this package for more details.

What’s in the box

This package contains specialised functions for each type of operation (read/write/validate) and the storage location of the JSON (string/file/raw vector/connection).

The matrix of available operations and storage is shown below:

          string               file                  raw              conn              options
read      read_json_str()      read_json_file()      read_json_raw()  read_json_conn()  opts_read_json()
write     write_json_str()     write_json_file()                                        opts_write_json()
validate  validate_json_str()  validate_json_file()
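As a rough illustration of how the functions in the matrix fit together, the sketch below writes a data.frame to a temporary file, validates it, reads it back, and parses a raw vector. The argument order (object first, then filename) is an assumption based on the function names, and the `requireNamespace()` guard is only there so the snippet still runs where {yyjsonr} is not installed.

```r
# A minimal sketch; argument order is assumed, see the package help pages.
if (requireNamespace("yyjsonr", quietly = TRUE)) {
  tmp <- tempfile(fileext = ".json")

  # write: R object -> JSON file
  yyjsonr::write_json_file(head(mtcars, 2), tmp)

  # validate: check that the file holds well-formed JSON
  ok <- yyjsonr::validate_json_file(tmp)

  # read: JSON file -> R object
  df <- yyjsonr::read_json_file(tmp)

  # read from a raw vector, avoiding a rawToChar() conversion
  from_raw <- yyjsonr::read_json_raw(charToRaw('{"a": [1, 2, 3]}'))

  # validate a (deliberately truncated) string
  bad <- yyjsonr::validate_json_str('{"a": [1, 2, 3')
}
```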

Comparison to other packages with read/write JSON

          Write JSON  Read JSON
yyjsonr   Fast!       Fast!
jsonlite  Yes         Yes
jsonify   Yes         Yes

Note: Benchmarks were run on Apple M2 Mac. See file “man/benchmark/benchmark.Rmd” for details.

Installation

You can install from GitHub with:

# install.packages('remotes')
remotes::install_github('coolbutuseless/yyjsonr')

Simple usage example

library(yyjsonr)

str <- write_json_str(head(iris, 3), pretty = TRUE)
cat(str)
#> [
#>   {
#>     "Sepal.Length": 5.1,
#>     "Sepal.Width": 3.5,
#>     "Petal.Length": 1.4,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.9,
#>     "Sepal.Width": 3.0,
#>     "Petal.Length": 1.4,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   },
#>   {
#>     "Sepal.Length": 4.7,
#>     "Sepal.Width": 3.2,
#>     "Petal.Length": 1.3,
#>     "Petal.Width": 0.2,
#>     "Species": "setosa"
#>   }
#> ]

read_json_str(str)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa

Future

  • Re-introduce NDJSON support
    • NDJSON support was removed for the initial CRAN release for the sake of my sanity.
    • See the ndjson branch of this repository
  • Re-introduce GeoJSON support
    • GeoJSON support was removed for the initial CRAN release for the sake of my sanity.
    • See the geojson branch of this repository

Limitations

  • Some datatypes are not currently supported (listed below). Please file an issue on GitHub if these types are critical for you; test cases are also appreciated.
    • Complex numbers
    • POSIXlt
    • Matrices of POSIXct / Date
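Until these types are supported, one possible workaround is to convert them to a supported type before serialising. The helper below is hypothetical (it is not part of {yyjsonr}), and turning complex numbers into strings is only one of several reasonable encodings.

```r
# Hypothetical pre-processing helper, not part of yyjsonr.
to_supported <- function(x) {
  if (inherits(x, "POSIXlt")) {
    as.POSIXct(x)   # POSIXlt -> POSIXct
  } else if (is.complex(x)) {
    format(x)       # complex -> character representation
  } else {
    x               # everything else is left alone
  }
}

x <- list(
  z    = complex(real = 1, imaginary = 2),
  when = as.POSIXlt("2024-01-01 00:00:00", tz = "UTC"),
  n    = 1:3
)

# Convert each top-level element before handing the list to a writer.
y <- lapply(x, to_supported)
str(y)
```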

Acknowledgements

  • R Core for developing and maintaining the language.
  • CRAN maintainers, for patiently shepherding packages onto CRAN and maintaining the repository

yyjsonr's People

Contributors

coolbutuseless, hrbrmstr, shikokuchuo

yyjsonr's Issues

Support closures

plotly internally uses jsonlite to convert objects to JSON, and wants to serialize closures

  • [] figure out what {jsonlite} is doing with closures in general
  • [] figure out what {plotly} is using closures for
  • [] clone it!

CRAN release?

Any chance you plan on making a CRAN release? I'd like to give it a go :)

Possible bug: vectors_to_df = FALSE still returns a dataframe

Not sure if this is a bug or intended behaviour. However it seems like vectors_to_df is ignored when set to FALSE.

obj <- yyjsonr::write_json_str(head(iris, 5), auto_unbox = T)
yyjsonr::read_json_str(obj, vectors_to_df = FALSE)
#>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
#> 1          5.1         3.5          1.4         0.2  setosa
#> 2          4.9         3.0          1.4         0.2  setosa
#> 3          4.7         3.2          1.3         0.2  setosa
#> 4          4.6         3.1          1.5         0.2  setosa
#> 5          5.0         3.6          1.4         0.2  setosa

Created on 2023-09-13 with reprex v2.0.2

I was expecting the following behaviour.

obj <- yyjsonr::write_json_str(head(iris, 5), auto_unbox = T)
jsonlite::fromJSON(obj, simplifyDataFrame = F)
#> [[1]]
#> [[1]]$Sepal.Length
#> [1] 5.1
#> 
#> [[1]]$Sepal.Width
#> [1] 3.5
#> 
#> [[1]]$Petal.Length
#> [1] 1.4
#> 
#> [[1]]$Petal.Width
#> [1] 0.2
#> 
#> [[1]]$Species
#> [1] "setosa"
#> 
#> 
#> [[2]]
#> [[2]]$Sepal.Length
#> [1] 4.9
#> 
#> [[2]]$Sepal.Width
#> [1] 3
#> 
#> [[2]]$Petal.Length
#> [1] 1.4
#> 
#> [[2]]$Petal.Width
#> [1] 0.2
#> 
#> [[2]]$Species
#> [1] "setosa"
#> 
#> 
#> [[3]]
#> [[3]]$Sepal.Length
#> [1] 4.7
#> 
#> [[3]]$Sepal.Width
#> [1] 3.2
#> 
#> [[3]]$Petal.Length
#> [1] 1.3
#> 
#> [[3]]$Petal.Width
#> [1] 0.2
#> 
#> [[3]]$Species
#> [1] "setosa"
#> 
#> 
#> [[4]]
#> [[4]]$Sepal.Length
#> [1] 4.6
#> 
#> [[4]]$Sepal.Width
#> [1] 3.1
#> 
#> [[4]]$Petal.Length
#> [1] 1.5
#> 
#> [[4]]$Petal.Width
#> [1] 0.2
#> 
#> [[4]]$Species
#> [1] "setosa"
#> 
#> 
#> [[5]]
#> [[5]]$Sepal.Length
#> [1] 5
#> 
#> [[5]]$Sepal.Width
#> [1] 3.6
#> 
#> [[5]]$Petal.Length
#> [1] 1.4
#> 
#> [[5]]$Petal.Width
#> [1] 0.2
#> 
#> [[5]]$Species
#> [1] "setosa"

Created on 2023-09-13 with reprex v2.0.2

> sessioninfo::session_info()
─ Session info ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 setting  value
 version  R version 4.3.1 (2023-06-16)
 os       macOS Ventura 13.4
 system   aarch64, darwin20
 ui       RStudio
 language (EN)
 collate  en_US.UTF-8
 ctype    en_US.UTF-8
 tz       Europe/London
 date     2023-09-13
 rstudio  2023.06.2+561 Mountain Hydrangea (desktop)
 pandoc   3.1.7 @ /opt/homebrew/bin/ (via rmarkdown)

─ Packages ────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
 package     * version date (UTC) lib source
 callr         3.7.3   2022-11-02 [1] CRAN (R 4.3.0)
 cli           3.6.1   2023-03-23 [1] CRAN (R 4.3.0)
 clipr         0.8.0   2022-02-22 [1] CRAN (R 4.3.0)
 digest        0.6.33  2023-07-07 [1] CRAN (R 4.3.0)
 evaluate      0.21    2023-05-05 [1] CRAN (R 4.3.0)
 fansi         1.0.4   2023-01-22 [1] CRAN (R 4.3.0)
 fastmap       1.1.1   2023-02-24 [1] CRAN (R 4.3.0)
 fs            1.6.3   2023-07-20 [1] CRAN (R 4.3.0)
 glue          1.6.2   2022-02-24 [1] CRAN (R 4.3.0)
 htmltools     0.5.6   2023-08-10 [1] CRAN (R 4.3.0)
 jsonlite      1.8.7   2023-06-29 [1] CRAN (R 4.3.0)
 knitr         1.43    2023-05-25 [1] CRAN (R 4.3.0)
 lifecycle     1.0.3   2022-10-07 [1] CRAN (R 4.3.0)
 magrittr      2.0.3   2022-03-30 [1] CRAN (R 4.3.0)
 pillar        1.9.0   2023-03-22 [1] CRAN (R 4.3.0)
 pkgconfig     2.0.3   2019-09-22 [1] CRAN (R 4.3.0)
 processx      3.8.2   2023-06-30 [1] CRAN (R 4.3.0)
 ps            1.7.5   2023-04-18 [1] CRAN (R 4.3.0)
 purrr         1.0.2   2023-08-10 [1] CRAN (R 4.3.0)
 R.cache       0.16.0  2022-07-21 [1] CRAN (R 4.3.0)
 R.methodsS3   1.8.2   2022-06-13 [1] CRAN (R 4.3.0)
 R.oo          1.25.0  2022-06-12 [1] CRAN (R 4.3.0)
 R.utils       2.12.2  2022-11-11 [1] CRAN (R 4.3.0)
 R6            2.5.1   2021-08-19 [1] CRAN (R 4.3.0)
 reprex        2.0.2   2022-08-17 [1] CRAN (R 4.3.0)
 rlang         1.1.1   2023-04-28 [1] CRAN (R 4.3.0)
 rmarkdown     2.24    2023-08-14 [1] CRAN (R 4.3.0)
 rstudioapi    0.15.0  2023-07-07 [1] CRAN (R 4.3.0)
 sessioninfo   1.2.2   2021-12-06 [1] CRAN (R 4.3.0)
 styler        1.10.2  2023-08-29 [1] CRAN (R 4.3.0)
 tibble        3.2.1   2023-03-20 [1] CRAN (R 4.3.0)
 utf8          1.2.3   2023-01-31 [1] CRAN (R 4.3.0)
 vctrs         0.6.3   2023-06-14 [1] CRAN (R 4.3.0)
 withr         2.5.0   2022-03-03 [1] CRAN (R 4.3.0)
 xfun          0.40    2023-08-09 [1] CRAN (R 4.3.0)
 yaml          2.3.7   2023-01-23 [1] CRAN (R 4.3.0)
 yyjsonr       0.1.9   2023-09-13 [1] Github (coolbutuseless/yyjsonr@9b99579)

 [1] /Users/dyfanjones/Library/R/arm64/4.3/library
 [2] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library

───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

Feature request: digit support

It would be helpful for yyjsonr to support a digit/precision parameter to indicate how numeric values come out in the JSON output. Currently you can do this by ensuring the field type is integer, but this can be messy in nested data frames. Ideally something like
digits = 0 (integer)
digits = 1 (decimal with one place)
etc.

Alternatively, for my use case, I really just need numeric to come out as integer values, so an option to force numbers to integers would work for me. eg,
force_int = TRUE
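In the meantime, rounding can be applied on the R side before serialising, which also covers nested structures. The sketch below is an illustration only: `round_numerics()` is a hypothetical helper, not a {yyjsonr} function, and `digits = 0` is treated as "convert to integer" to mirror the request above.

```r
# Hypothetical helper, not part of yyjsonr: round every double in a nested list.
round_numerics <- function(x, digits = 0) {
  rapply(
    x,
    function(v) {
      if (!is.double(v)) return(v)              # leave integers untouched
      if (digits == 0) as.integer(round(v)) else round(v, digits)
    },
    classes = "numeric",
    how = "replace"                             # keep the nested structure
  )
}

x <- list(a = 1.6, b = list(c = c(2.2, 3.7), d = "text"))

y0 <- round_numerics(x)                            # doubles become integers
y1 <- round_numerics(list(p = 3.14159), digits = 2)
str(y0)
```

Values outside the integer range would become `NA` with `as.integer()`, so this is only suitable when the numbers are known to be small.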

Support writing data.frames without column names

jsonlite writes a data.frame with column names as a JSON []-array of JSON {}-objects (with column names as keys).

If the data.frame is unnamed then it is written as nested JSON []-arrays.

> aa <- data.frame(x = 1:2, y = c('yes', 'no'))
> jsonlite::toJSON(aa)
[{"x":1,"y":"yes"},{"x":2,"y":"no"}] 
> colnames(aa) <- NULL
> jsonlite::toJSON(aa)
[[1,"yes"],[2,"no"]]

yyjsonr does not currently support data.frames without column names, and methods for producing the nested array format are probably a bit obtuse e.g.

write_json_str(purrr::transpose(aa), auto_unbox = TRUE)

This nested json []-array format with heterogeneous types is used internally within plotly
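A base-R alternative to `purrr::transpose()` is sketched below: it turns each row into an unnamed list so that a writer emits nested arrays. `df_to_row_arrays()` is a hypothetical helper name, and the final call assumes `write_json_str()` accepts `auto_unbox = TRUE` as in the line above.

```r
# Hypothetical helper: one unnamed list per row, preserving column types.
df_to_row_arrays <- function(df) {
  lapply(seq_len(nrow(df)), function(i) unname(lapply(df, `[[`, i)))
}

aa <- data.frame(x = 1:2, y = c("yes", "no"), stringsAsFactors = FALSE)
rows <- df_to_row_arrays(aa)
str(rows)

# Serialise only if yyjsonr is available.
if (requireNamespace("yyjsonr", quietly = TRUE)) {
  cat(yyjsonr::write_json_str(rows, auto_unbox = TRUE))
}
```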

Explicit scalars

Could we add support for explicitly treating a vector as scalar? This is the reverse of auto_unbox = TRUE and opting out with I(); instead everything is vector and we explicitly decorate a length-1 vector as scalar. This also gives us the option of throwing errors when our "scalar" isn't length-1.

Example R to JSON:

Input

list(my_scalar = "foo", my_vector = "bar")

Output

{"my_scalar":"foo","my_vector":["bar"]}

{jsonlite}'s approach to this is to decorate scalars with a class ("scalar").

jsonlite::toJSON(list(my_scalar = jsonlite::unbox("foo"), my_vector = "bar"))
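A minimal sketch of what such a decorator could look like follows. The name `as_scalar()` is hypothetical, and reusing the "scalar" class is an assumption borrowed from the {jsonlite} approach described above; the point is only to show the length-1 check.

```r
# Hypothetical decorator: mark a length-1 vector as an explicit scalar.
as_scalar <- function(x) {
  if (length(x) != 1L) {
    stop("as_scalar() requires a length-1 vector, got length ", length(x))
  }
  structure(x, class = c("scalar", class(x)))
}

ok <- as_scalar("foo")

# A vector that is not length-1 is rejected instead of being silently boxed.
err <- tryCatch(as_scalar(c("a", "b")), error = function(e) conditionMessage(e))
print(err)
```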

General jsonlite compatibility issues

{yyjsonr} has a "soft" goal to try and be compatible with {jsonlite} in the most common use cases.

Full compatibility with {jsonlite} is a non-goal.

This is a place to voice any compatibility issue you have, or express support/disagreement for things on the list below. This will help prioritise programming effort.

  • Borrow (with permission!) the jsonlite test suite
  • Add a digits argument
  • promote_num_to_string - Option to promote numerics to strings if an array has a mixture of numbers and strings. set to TRUE for jsonlite compatibility.
  • Add 'json' class to match jsonlite's behaviour. Optional, and set to FALSE. Set to TRUE for jsonlite compatibility
  • Read/write arrays like jsonlite. Maybe.
  • a POSIXt = c('string', 'epoch', 'ISO8601') option like jsonlite::toJSON()

read_ndjson_file fails to parse long-ish lines

out.txt

## yyjsonr version 0.1.11

This fails on the attached file:

## Im5zMTpbTGphdmEubGFu
## ^
##   Error in yyjsonr::read_ndjson_file("~/projects/gncap/out.txt") : 
##   Couldn't parse JSON on line 211

It's also off by one line (line 212 is the error line)

This works:

stringi::stri_read_lines("~/projects/gncap/out.txt") |> 
  lapply(yyjsonr::read_json_str) |> 
  data.table::rbindlist(fill = TRUE) -> xdf

str(xdf)
## Classes ‘data.table’ and 'data.frame':	749 obs. of  35 variables:
##  $ ip                         : chr  "90.169.110.161" "18.119.179.222" "45.128.232.140" "93.174.95.106" ...
##  $ Accept                     : chr  "text/plain,text/html" NA NA "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8" ...
##  $ Host                       : chr  "116.202.22.148" "116.202.22.148" "google.com:443" "116.202.22.148" ...
##  $ Method                     : chr  "GET" "GET" "CONNECT" "GET" ...
##  $ Path                       : chr  "/" "/.git/credentials" "google.com:443" "/" ...
##  $ Http_Version               : chr  "HTTP/1.0" "HTTP/1.1" "HTTP/1.1" "HTTP/1.1" ...
##  $ Accept_Charset             : chr  NA "utf-8" NA NA ...
##  $ Accept_Encoding            : chr  NA "gzip" NA "identity" ...
##  $ Connection                 : chr  NA "close" NA NA ...
##  $ User_Agent                 : chr  NA "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.0 Safari/605.1.15" "Go-http-client/1.1" "Mozilla/5.0 (Windows NT 6.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2228.0 Safari/537.36" ...
##  $ Accept_Language            : chr  NA NA NA NA ...
##  $ Authorization              : chr  NA NA NA NA ...
##  $ Upgrade_Insecure_Requests  : chr  NA NA NA NA ...
##  $ Content_Type               : chr  NA NA NA NA ...
##  $ X_Requested_With           : chr  NA NA NA NA ...
##  $ Content_Length             : chr  NA NA NA NA ...
##  $ POST_Payload               : chr  NA NA NA NA ...
##  $ Cache_Control              : chr  NA NA NA NA ...
##  $ DNT                        : chr  NA NA NA NA ...
##  $ Sec-Gpc                    : chr  NA NA NA NA ...
##  $ x-datadog-trace-id         : chr  NA NA NA NA ...
##  $ x-datadog-parent-id        : chr  NA NA NA NA ...
##  $ x-datadog-sampling-priority: chr  NA NA NA NA ...
##  $ Origin                     : chr  NA NA NA NA ...
##  $ Referer                    : chr  NA NA NA NA ...
##  $ Cookie                     : chr  NA NA NA NA ...
##  $ X-Test-Token               : chr  NA NA NA NA ...
##  $ Pragma                     : chr  NA NA NA NA ...
##  $ SOAPAction                 : chr  NA NA NA NA ...
##  $ Proxy_Connection           : chr  NA NA NA NA ...
##  $ BS_REAL_IP                 : chr  NA NA NA NA ...
##  $ Range                      : chr  NA NA NA NA ...
##  $ http-accept                : chr  NA NA NA NA ...
##  $ X-User-Agent               : chr  NA NA NA NA ...
##  $ X-Use-Agent                : chr  NA NA NA NA ...
##  - attr(*, ".internal.selfref")=<externalptr>

FR: support json.gz

As always, great package!

It would be nice if compressed files, e.g. json.gz, were handled automatically by read_json_file().
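Until that is supported, a workaround along the following lines may help: decompress with a base-R `gzfile()` connection and pass the text to `read_json_str()`. `read_gz_text()` is a hypothetical helper, and the demo file is created on the fly.

```r
# Hypothetical helper: read a gzip-compressed text file into one string.
read_gz_text <- function(path) {
  con <- gzfile(path, open = "rt")
  on.exit(close(con))
  paste(readLines(con, warn = FALSE), collapse = "\n")
}

# Create a small .json.gz file for demonstration.
tmp <- tempfile(fileext = ".json.gz")
con <- gzfile(tmp, open = "wt")
writeLines('{"a": [1, 2, 3]}', con)
close(con)

txt <- read_gz_text(tmp)

# Parse only if yyjsonr is available.
if (requireNamespace("yyjsonr", quietly = TRUE)) {
  str(yyjsonr::read_json_str(txt))
}
```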

Feature Request: Unboxing nuance

This looks like a great package! One thing that annoys me about working with JSON in R is the inability to distinguish between scalars and length-1 vectors. I often find myself working with APIs that are picky about the difference.

Right now, I think yyjsonr's auto_unbox behavior is all-or-nothing. jsonlite has a nice feature where you can override auto_unbox by using R's as-is class (I()).

What jsonlite does not do (and as far as I know no public R package does) is annotate the differences between a scalar and length 1 vector at read time (via some class or other attribute metadata). This means that to round trip a JSON object, I have to manually go through the R object and add the as-is class (with pre-knowledge about where it belongs).

Does this fit in with the vision of yyjsonr, or would it sacrifice too much speed? If I find time, would you consider a PR?

# both jsonlite and yyjsonr can unbox none or all length 1 vectors
list(a = 1L, b = 1L) |> jsonlite::toJSON()
#> {"a":[1],"b":[1]} 
list(a = 1L, b = 2L) |> yyjsonr::write_json_str() |> cat()
#> {"a":[1],"b":[2]} 

list(a = 1L, b = 2L) |>  yyjsonr::write_json_str(auto_unbox = TRUE) |> cat()
#> {"a":1,"b":2} 
list(a = 1L, b = 1L) |> jsonlite::toJSON(auto_unbox = TRUE)
#> {"a":1,"b":1} 


# jsonlite has a way to override auto-unboxing using `I()` (the as-is class):
list(a = 1L, b = I(2L)) |> jsonlite::toJSON(auto_unbox = TRUE)
#> {"a":1,"b":[2]} 

# I think yyjsonr does not (yet?):
list(a = 1L, b = I(2L)) |>  yyjsonr::write_json_str(auto_unbox = TRUE) |> cat()
#> {"a":1,"b":2}


# In addition, I think yyjsonr could improve upon jsonlite's API because when trying to roundtrip
# it requires information that's not possible to know from its API (and can be tedious to add)
set_boxes <- \(x, names) modifyList(x, lapply(x[names], I))

'{"a": 1, "b": [2], "c": [3, 4]}' |> 
    jsonlite::fromJSON() |> 
    set_boxes("b") |> 
    jsonlite::toJSON(auto_unbox = TRUE)
#> {"a":1,"b":[2],"c":[3,4]} 



# What I hope yyjsonr could do (proposed new option named `box_singletons`?)
'{"a": 1, "b": [2], "c": [3, 4]}' |> 
    yyjsonr::read_json(box_singletons = TRUE) |> 
    str()
#> List of 3
#>  $ a: int 1
#>  $ b: 'AsIs' int 2
#>  $ c: int [1:2] 3 4


# And then I'd be able to round trip:
'{"a": 1, "b": [2], "c": [3, 4]}' |> 
    yyjsonr::read_json(box_singletons = TRUE) |> 
    yyjsonr::write_json_str(auto_unbox = TRUE) |> 
    cat()
#> {"a":1,"b":[2],"c":[3,4]}

It also occurs to me that while I'm used to using I() to prevent auto-unboxing, jsonlite actually recommends explicitly using jsonlite::unbox() to add the scalar class to scalars. I don't have a preference for which way it should work in yyjsonr (or whether both should be supported).

set_unboxes <- \(x, names) modifyList(x, lapply(x[names], jsonlite::unbox))

'{"a": 1, "b": [2], "c": [3, 4]}' |> 
    jsonlite::fromJSON() |> 
    set_unboxes("a") |> 
    jsonlite::toJSON()
 #> {"a":1,"b":[2],"c":[3,4]}

Skip JSON serialisation if already JSON (json_verbatim)

Could we optionally skip json serialisation of character vectors that are already json? This is {jsonlite}'s json_verbatim = TRUE option. Example:

my_obj <- list(already_json = structure('{"leave_me_alone":"yes"}', class = "json"))
yyjsonr::write_json_str(my_obj, list(skip_json = TRUE))

Expected output

{"already_json":{"leave_me_alone":"yes"}}

Revisit naming and creation of options.

  • Do a 'language' pass to find better names for options
  • (Simultaneously) Do a 'documentation' pass to clearly explain the option and when you might need it.
  • use same option names in R and C for the sake of sanity.

Parsing raw data

Thanks for this! Definitely as cool as it gets for 'json' data.

From my perspective, the only real comparison is with RcppSimdJson. It's nice to say it's much faster than jsonlite but it's probably the other reasons why people continue to use it (proven track record, good enough etc.)

From a programming perspective, I'd like this package to be as predictable (and fast) as possible in terms of return types rather than trying to provide heuristic guesses. I think this also fits in with the philosophy of the 'yyjson' library.

Quick benchmarking shows that on small data (tens of microseconds to parse), RcppSimdJson is slightly faster, whilst yyjsonr is faster on larger data (hundreds of microseconds). This is already very impressive indeed!

Just an initial query at this point - is it possible to parse a raw instead of character vector? RcppSimdJson is the same speed on either, maybe even slightly faster on raw. It is nice as it means the incoming data does not need converting in the first place - the rough equivalent of a rawToChar() operation.

Example:

a <- nanonext::ncurl("https://postman-echo.com/get", convert = FALSE)
a
RcppSimdJson::fparse(a[["raw"]])

b <- nanonext::ncurl("https://postman-echo.com/get", convert = TRUE)
b
yyjsonr::from_json_str(b[["data"]])

GeoJSON support

After first CRAN release, re-introduce

  • to_geojson_str()
  • from_geojson_str()

Feature: Pre-serialise s3 generic for object transform

An S3 generic that performs object transformations prior to JSON serialisation would give package authors full control over how their objects are serialised. This is the same concept as the JavaScript toJSON() method on objects.

Some use cases:

  • Adding / removing elements from lists
  • Convert element names to camelCase
  • Formatting dates, rounding numbers, etc
  • Applying AsIs to elements that must be scalars

A naïve example implementation:

yyjson_mut_val* any_serialise_function(SEXP object) {
  // should be global
  SEXP to_json = PROTECT(
    Rf_findFun(
      Rf_install("to_json"),
      Rf_findVarInFrame(R_NamespaceRegistry, Rf_install("yyjsonr"))
    )
  );

  SEXP trans_object = PROTECT(
    Rf_eval(Rf_lang2(to_json, object), R_GlobalEnv)
  );

  // serialise trans_object

  UNPROTECT(2); // 1 if to_json is global
  // return the yyjson_mut_val*
}

#' @export
to_json <- function(object, ...) UseMethod("to_json")

# A real use-case would be to add AsIs to scalars, drop items that aren't required, camelCase property names etc.
# but complete object replacement is possible
#' @export
to_json.foobar <- function(object, ...) list(foo = "bar")
foobar <- structure(list(whatever = "doesn't matter"), class = "foobar")
yyjsonr::write_json_str(foobar)
#> [1] "{\"foo\":[\"bar\"]}"

Created on 2023-10-09 with reprex v2.0.2

Overhead? Yes there is

Executing an R method for each object in the tree has some overhead. In prototyping, I observed around 1.5secs of overhead per 1 000 000 to_json() dispatches. If we assume that most objects won't have a to_json() method, we can avoid much of this overhead by skipping the to_json.default() call.

One approach is to cache the classes implementing to_json() and only dispatch if our input object inherits any of these classes.

E.g.

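// classes[] and n_classes are hypothetical: the cached names of classes with a to_json() method
for (int i = 0; i < n_classes; i++) {
  if (!Rf_inherits(object, classes[i])) continue;
  // invoke to_json()
  break;
}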

In prototyping, with only one s3 method to check, I found this reduced the overhead of serialising types without a to_json() method to approx 20ms per 1 000 000 objects.

my_list <- rep(
  list(structure(list(whatever = "doesn't matter"), class = "not_foobar")),
  1000000
)

bench::mark(
  yyjsonr::write_json_str(my_list),
  iterations = 10,
  check = FALSE
)
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 yyjsonr::write_json_str(my_list)    268ms   270ms      3.70    30.5MB        0

Created on 2023-10-09 with reprex v2.0.2

With to_json()

my_list <- rep(
  list(structure(list(whatever = "doesn't matter"), class = "not_foobar")),
  1000000
)

bench::mark(
  yyjsonr::write_json_str(my_list),
  iterations = 10,
  check = FALSE
)
#> # A tibble: 1 × 6
#>   expression                            min  median `itr/sec` mem_alloc `gc/sec`
#>   <bch:expr>                       <bch:tm> <bch:t>     <dbl> <bch:byt>    <dbl>
#> 1 yyjsonr::write_json_str(my_list)    285ms   286ms      3.49    30.5MB        0

Created on 2023-10-09 with reprex v2.0.2
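The class-cache idea described above can also be sketched at the R level. This is an illustration only, not the prototype's C implementation: `to_json_classes` and `maybe_to_json()` are hypothetical names, and the cache is a hard-coded character vector.

```r
# Generic and one method, mirroring the example earlier in this issue.
to_json <- function(object, ...) UseMethod("to_json")
to_json.foobar <- function(object, ...) list(foo = "bar")

# Hypothetical cache: classes known to implement to_json().
to_json_classes <- c("foobar")

# Dispatch only when the object inherits from a cached class;
# everything else passes through without an S3 call.
maybe_to_json <- function(object) {
  if (inherits(object, to_json_classes)) to_json(object) else object
}

foobar <- structure(list(whatever = "doesn't matter"), class = "foobar")
other  <- structure(list(whatever = "doesn't matter"), class = "not_foobar")

str(maybe_to_json(foobar))
str(maybe_to_json(other))
```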

devel R repository

The package could benefit from extra testing (in downstream dependencies) if a devel R repository were provided, aside from easier installation, obviously.

If there is a will to provide one, then I can submit a PR for a GHA workflow that publishes a devel R repo.
