Conversions between 'R' objects and JavaScript Object Notation (JSON) using the 'rapidjsonr' library <https://CRAN.R-project.org/package=rapidjsonr>.
jsonify converts R objects to JSON.
Because I wanted it available at the source (C++) level, so it can be integrated into other packages.
Yeah, it's pretty good.
library(microbenchmark)
library(jsonlite)

n <- 1e6
df <- data.frame(
  id = 1:n,
  value = sample(letters, size = n, replace = T),
  val2 = rnorm(n = n),
  log = sample(c(T,F), size = n, replace = T),
  stringsAsFactors = FALSE
)

microbenchmark(
  jsonlite = {js <- jsonlite::toJSON( df )},
  jsonify = {js <- jsonify::to_json( df )},
  times = 3
)
# Unit: seconds
#      expr      min       lq     mean   median       uq      max neval
#  jsonlite 2.017081 2.063732 2.540350 2.110383 2.801984 3.493585     3
#   jsonify 1.186239 1.202719 1.514067 1.219198 1.677981 2.136763     3

n <- 1e4
x <- list(
  x = rnorm(n = n),
  y = list(x = rnorm(n = n)),
  z = list( list( x = rnorm(n = n))),
  xx = rnorm(n = n),
  yy = data.frame(
    id = 1:n,
    value = sample(letters, size = n, replace = T),
    val2 = rnorm(n = n),
    log = sample(c(T,F), size = n, replace = T)
  )
)

microbenchmark(
  jsonlite = {js <- jsonlite::toJSON( x )},
  jsonify = {js <- jsonify::to_json( x )},
  times = 5
)
# Unit: milliseconds
#      expr      min       lq     mean   median       uq      max neval
#  jsonlite 18.52028 18.82241 19.32112 18.99683 19.18103 21.08508     5
#   jsonify 17.72060 18.19092 19.58308 19.52457 21.14687 21.33241     5
jsonlite?
Yeah, but I realised it was comparing two different methods. When jsonify was parsing nested lists, it was parsing data.frames column-wise, whereas jsonlite was parsing them row-wise, which is a slower operation.
Correct.
Here's a more suitable comparison:
n <- 1e4
x <- list(
  x = rnorm(n = n),
  y = list(x = rnorm(n = n)),
  z = list( list( x = rnorm(n = n))),
  xx = rnorm(n = n),
  yy = data.frame(
    id = 1:n,
    value = sample(letters, size = n, replace = T),
    val2 = rnorm(n = n),
    log = sample(c(T,F), size = n, replace = T)
  )
)

microbenchmark(
  jsonlite_row = {js <- jsonlite::toJSON( x )},
  jsonlite_col = {js <- jsonlite::toJSON( x, dataframe = "columns" )},
  jsonify_row = {js <- jsonify::to_json( x )},
  jsonify_col = {js <- jsonify::to_json( x, by = "column" )},
  times = 5
)
# Unit: milliseconds
#          expr       min        lq      mean    median        uq       max neval
#  jsonlite_row 20.533642 20.717894 27.294220 21.122860 21.426250 52.670456     5
#  jsonlite_col 13.691643 13.812459 15.683795 14.293177 15.655705 20.965993     5
#   jsonify_row 17.506507 17.951948 20.929641 19.827791 21.161389 28.200572     5
#   jsonify_col  7.262305  7.382238  7.409085  7.434759  7.435476  7.530645     5
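To make the row-wise versus column-wise distinction concrete, here is a minimal base R sketch of the two layouts. It does not use jsonify or jsonlite at all; the variable names `by_column` and `by_row` are made up for illustration. It only shows why the row-wise layout involves more work: a data.frame is already stored as a list of columns, whereas building one record per row means visiting every cell.

```r
# Base R only: a small data.frame viewed in the two layouts.
df <- data.frame(id = 1:3, val = c("a", "b", "c"), stringsAsFactors = FALSE)

# Column-wise: one entry per column, each holding a whole vector.
# A data.frame is already stored like this, so the conversion is cheap.
by_column <- as.list(df)

# Row-wise: one record per row, which means touching every cell.
by_row <- lapply(seq_len(nrow(df)), function(i) as.list(df[i, , drop = FALSE]))

str(by_column)    # a list of 2 vectors, each of length 3
str(by_row[[1]])  # a list of 2 scalars: id = 1L, val = "a"
```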
There's no Date type in JSON, how have you handled this?
At its core, Dates in R are numeric, so they are treated as numbers when converted to JSON. However, the user can coerce them to character through the numeric_dates argument.
df <- data.frame(dte = as.Date("2018-01-01"))
jsonify::to_json( df )
# [{"dte":17532.0}]

df <- data.frame(dte = as.Date("2018-01-01"))
jsonify::to_json( df, numeric_dates = FALSE )
# [{"dte":"2018-01-01"}]
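The number in the first output is simply R's internal representation of the date. A short base R check (no jsonify needed) shows where 17532 comes from: it is the count of days since 1970-01-01.

```r
d <- as.Date("2018-01-01")

unclass(d)     # 17532, the number of days since 1970-01-01
as.numeric(d)  # the same value as a plain number
format(d)      # "2018-01-01", the character form
```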
POSIXct and POSIXlt?
The same.
jsonify::to_json( as.POSIXct("2018-01-01 10:00:00") )
# [1514761200.0]

jsonify::to_json( as.POSIXct("2018-01-01 10:00:00"), numeric_dates = FALSE)
# ["2017-12-31T23:00:00"]
However, here the POSIXct values are returned in UTC timezone. This is by design.
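The shift from 10:00 on 1 January to 23:00 on 31 December can be reproduced in base R. This sketch assumes the output above was generated on a machine set to a UTC+11 timezone; "Australia/Melbourne" is used here purely as an assumed example of such a zone, since the original does not state it.

```r
# Assumption: local time is UTC+11 (e.g. Melbourne in January).
p <- as.POSIXct("2018-01-01 10:00:00", tz = "Australia/Melbourne")

# POSIXct is stored as seconds since 1970-01-01 00:00:00 UTC.
as.numeric(p)  # 1514761200

# The same instant, printed in UTC.
format(p, format = "%Y-%m-%dT%H:%M:%S", tz = "UTC")  # "2017-12-31T23:00:00"
```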
POSIXlt will return each component of the date-time:
x <- as.POSIXlt("2018-01-01 01:00:00", tz = "GMT")
jsonify::to_json( x )
# {"sec":[0.0],"min":[0],"hour":[1],"mday":[1],"mon":[0],"year":[118],"wday":[1],"yday":[0],"isdst":[0]}

jsonify::to_json( x, numeric_dates = FALSE)
# {"sec":[0.0],"min":[0],"hour":[1],"mday":[1],"mon":[0],"year":[118],"wday":[1],"yday":[0],"isdst":[0]}
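The components in that output follow base R's POSIXlt conventions: the year is counted from 1900 and the month from 0. The sketch below reads them back in base R. Converting to POSIXct first is shown only as one possible option if a single value per date-time is preferred; it is a suggestion, not something the package requires.

```r
x <- as.POSIXlt("2018-01-01 01:00:00", tz = "GMT")

x$year + 1900  # 2018 (year is stored as years since 1900)
x$mon + 1      # 1    (month is stored 0-based)
c(x$mday, x$hour, x$wday, x$yday)  # 1 1 1 0 (a Monday, first day of the year)

# One option for a single value: convert to POSIXct before serialising.
ct <- as.POSIXct(x)
as.numeric(ct)  # 1514768400
```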
The purpose of this library is speed. A lot of overhead is incurred
iterating over a list to find and convert objects from one type to
another.
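For a sense of what that R-level traversal looks like, here is a minimal sketch using base R's `rapply()` to find every Date in a nested list and coerce it to character. This is only an illustration of the kind of recursion being described, not code taken from jsonify; the list `l` and the result `chr` are made-up names.

```r
# A nested list with Dates at different depths.
l <- list(
  dte = as.Date("2018-01-01"),
  nested = list(dte = as.Date("2018-01-02"), n = 1.5)
)

# Walk the whole list in R, converting only the Date elements.
# Every element has to be visited and its class checked.
chr <- rapply(l, format, classes = "Date", how = "replace")

chr$dte         # "2018-01-01"
chr$nested$dte  # "2018-01-02"
chr$nested$n    # 1.5, left untouched
```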
For v0.2.0 I’ve managed to get the date handling done at the C++ level, so there’s no penalty for recursing through the list to coerce to character.
Therefore, lists will work too:
l <- list(
  dte = as.Date("2018-01-01"),
  psx = seq(as.POSIXct("2018-01-01 13:00:00"), as.POSIXct("2018-01-05 13:00:00"), length.out = 5),
  df = data.frame(
    psx = seq(as.POSIXct("2018-01-01 13:00:00"), as.POSIXct("2018-01-05 13:00:00"), length.out = 5)
  )
)
jsonify::to_json( l )
# {"dte":[17532.0],"psx":[1514772000,1514858400,1514944800,1515031200,1515117600],"df":[{"psx":1514772000},{"psx":1514858400},{"psx":1514944800},{"psx":1515031200},{"psx":1515117600}]}

js <- jsonify::to_json( l, numeric_dates = FALSE )
jsonify::pretty_json( js )
# {
#     "dte": [
#         "2018-01-01"
#     ],
#     "psx": [
#         "2018-01-01T02:00:00",
#         "2018-01-02T02:00:00",
#         "2018-01-03T02:00:00",
#         "2018-01-04T02:00:00",
#         "2018-01-05T02:00:00"
#     ],
#     "df": [
#         {
#             "psx": "2018-01-01T02:00:00"
#         },
#         {
#             "psx": "2018-01-02T02:00:00"
#         },
#         {
#             "psx": "2018-01-03T02:00:00"
#         },
#         {
#             "psx": "2018-01-04T02:00:00"
#         },
#         {
#             "psx": "2018-01-05T02:00:00"
#         }
#     ]
# }
And it’s still fast because of the design choice to coerce dates to UTC. All the date handling is done at the C++ level, not in R, so it’s faster.
dtes <- seq(as.Date("2018-01-01"), as.Date("2019-01-01"), length.out = 365)
psx <- seq(as.POSIXct("2018-01-01"), as.POSIXct("2019-01-01"), length.out = 365)

n <- 1e5
lst <- list(
  x = sample(dtes, size = n, replace = T),
  y = list(
    ya = sample(dtes, size = n, replace = TRUE),
    yb = rnorm(n = n),
    yx = list( sample(dtes, size = n, replace = T ) )
  ),
  p = psx
)

library( microbenchmark )
microbenchmark(
  jsonify1 = {jsonify::to_json( lst, numeric_dates = TRUE )},
  jsonify2 = {jsonify::to_json( lst, numeric_dates = FALSE )},
  jsonlite = {jsonlite::toJSON( lst )},
  times = 3
)
# Unit: milliseconds
#      expr       min        lq     mean    median        uq       max neval
#  jsonify1  57.61869  60.75988  63.8870  63.90106  67.02115  70.14124     3
#  jsonify2 308.03909 314.36571 317.2089 320.69232 321.79381 322.89529     3
#  jsonlite 692.80273 698.84349 714.4054 704.88426 725.20679 745.52933     3
Is the pretty_json() function new?
Yep, it’s a new feature in v0.2.0.
I want to be able to call the C++ code from another package, without going to & from R. Therefore, the C++ code is implemented in headers, so you can “link to” it in your own package.
For example, the LinkingTo section in DESCRIPTION will look something like:
LinkingTo:
    Rcpp,
    jsonify
And in a C++ source file you can #include the header and use the available functions:
// [[Rcpp::depends(jsonify)]]
Rcpp::StringVector
Yes. Just like the examples in this readme, use to_json():
df <- data.frame(id = 1:3, val = letters[1:3])
jsonify::to_json( df )
# [{"id":1,"val":"a"},{"id":2,"val":"b"},{"id":3,"val":"c"}]
Yep. Even though I constructed a data.frame() without setting stringsAsFactors = FALSE, jsonify automatically treats factors as strings.
Yes. And it’s to keep the data more in line with modern concepts and design patterns.
If you want factors, use factors_as_string = FALSE in the to_json() call:
jsonify::to_json( df, factors_as_string = FALSE )
# [{"id":1,"val":1},{"id":2,"val":2},{"id":3,"val":3}]
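The integers in that output are the factor's underlying codes. A short base R sketch shows the two views of a factor. Note that since R 4.0.0 `data.frame()` defaults to `stringsAsFactors = FALSE`, so on a current R the factor has to be requested explicitly, as done here.

```r
# A factor stores integer codes plus a set of labels (levels).
f <- factor(c("a", "b", "c"))

as.integer(f)    # 1 2 3, the underlying codes
as.character(f)  # "a" "b" "c", the labels
levels(f)        # "a" "b" "c"

# On R >= 4.0.0, ask for factors explicitly when building the data.frame.
df <- data.frame(id = 1:3, val = letters[1:3], stringsAsFactors = TRUE)
is.factor(df$val)  # TRUE
```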
Get the latest release version from CRAN
install.packages("jsonify")
Or the development version from GitHub with:
# install.packages("devtools")
devtools::install_github("SymbolixAU/jsonify")
as.json() method
print.json() method to print json class objects
minify_json() function to remove indentation
pretty_json() function to add indentation
digits argument to specify the number of digits to which numeric values will be rounded
unbox argument for to_json()
NEWS.md file to track changes to the package.