A 'sparklyr' Extension for Nested Data

A 'sparklyr' extension adding the capability to work easily with nested data.


sparklyr.nested 0.0.3

  • Initial release
  • To support read schemas, array_type, binary_type, boolean_type, byte_type, character_type, date_type, double_type, float_type, integer_type, long_type, map_type, numeric_type, string_type, struct_field, struct_type, and timestamp_type are provided. These return the Java references needed for schema definition.
  • sdf_select enables access to nested fields (inside maps, structs, and arrays of structs).
  • sdf_explode will convert arrays of structs to simple structs by duplicating rows over each unique array entry. Also works on map types if is_map=TRUE.
  • sdf_nest and sdf_unnest are modeled on the tidyr functions nest and unnest. sdf_unnest essentially chains sdf_explode and sdf_select with some schema inspection.
  • sdf_schema_json and sdf_schema_viewer support schema interrogation.
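
The functions listed above can be sketched together in a short session. This is a minimal illustration, not taken from the package documentation: it assumes a local Spark installation reachable through spark_connect(master = "local"), uses the built-in iris data (sparklyr replaces the dots in its column names with underscores on copy), and the names "Petal" and "petal_width" are chosen here purely for illustration.

```r
library(sparklyr)
library(sparklyr.nested)
library(dplyr)

# Assumption: a local Spark installation is available.
sc <- spark_connect(master = "local")

# copy_to() renames Petal.Length to Petal_Length, and so on.
iris_tbl <- copy_to(sc, iris, name = "iris", overwrite = TRUE)

# Nest the two petal columns into a single struct column named "Petal".
nested <- iris_tbl %>%
  sdf_nest(Petal_Length, Petal_Width, .key = "Petal")

# Inspect the resulting schema as a parsed JSON structure.
schema <- sdf_schema_json(nested)

# Reach into the struct; the left-hand side of the expression is the alias.
selected <- nested %>%
  sdf_select(Species, petal_width = Petal.Petal_Width)

# Promote the struct's sub-fields back to top-level columns.
flat <- nested %>%
  sdf_unnest(Petal)

spark_disconnect(sc)
```

sdf_explode would fit between the nesting and selection steps if "Petal" were an array of structs rather than a single struct; it is left out here because iris has no array column to explode.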


0.0.3 by Matt Pollock, a year ago

Report a bug at https://github.com/mitre/sparklyr.nested/issues

Browse source code at https://github.com/cran/sparklyr.nested

Authors: Matt Pollock [aut, cre], The MITRE Corporation [cph]

Documentation:   PDF Manual  

License: Apache License 2.0 | file LICENSE

Imports: sparklyr, jsonlite, listviewer, dplyr, rlang, purrr

Suggests: testthat

System requirements: Spark: 1.6.x or 2.x
