A 'Sparklyr' Extension for 'VariantSpark'

This is a 'sparklyr' extension integrating 'VariantSpark' and R. 'VariantSpark' is a framework based on 'scala' and 'spark' for analyzing genome datasets; see <https://bioinformatics.csiro.au/>. It was tested on datasets with 3000 samples, each containing 80 million features, in both unsupervised clustering approaches and supervised applications such as classification and regression. Genome datasets are usually written in VCF (Variant Call Format), a text file format used in bioinformatics for storing gene sequence variations. 'VariantSpark' is therefore well suited to genome research: it can read VCF files, run analyses, and return the output in a 'spark' data frame.


install.packages("variantspark")

0.1.1 by Samuel Macêdo, a year ago


Browse source code at https://github.com/cran/variantspark


Authors: Samuel Macêdo [aut, cre], Javier Luraschi [aut]


Documentation:   PDF Manual  


License: Apache License 2.0 | file LICENSE


Imports: sparklyr

Suggests: testthat

