Implements fast, exact bootstrap Principal Component Analysis and Singular Value Decompositions for high dimensional data, as described in <http://arxiv.org/abs/1405.0922>. For data matrices that are too large to operate on in memory, users can input objects with class 'ff' (see the 'ff' package), where the actual data is stored on disk. In that case, the package uses a block matrix algebra procedure to calculate the principal components (PCs) and bootstrap PCs. Depending on options set by the user, the 'parallel' package can be used to parallelize the calculation of the bootstrap PCs.
The R package bootSVD can be used to implement fast, exact bootstrap principal component analysis and singular value decompositions for high dimensional data, where the number of measurements per subject is much larger than the number of subjects. This package is based on the methodology outlined by Fisher et al. (2014), who demonstrate the method on a dataset of 352 brain magnetic resonance images (MRIs), with approximately 3 million measurements per subject.
The primary function in this package is the bootSVD function, for which we include a documented example based on simulated sleep electroencephalogram (EEG) data. When the data is too large to store in memory, functions in this package can also be applied to objects of class ff. ff objects have a representation in memory, but store their primary contents on disk (see the ff package).
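The block-wise idea can be illustrated in base R. This is a minimal sketch under stated assumptions, not the package's own implementation: an ordinary in-memory matrix stands in for an on-disk ff object, the data matrix is p x n with subjects in columns, and the names Y, block_size, YtY, U, d, and V are all hypothetical. Only a block of rows is touched at a time, so the full matrix never has to be processed in one step.

```r
set.seed(2)
p <- 1000   # measurements per subject (illustrative size)
n <- 15     # number of subjects
Y <- matrix(rnorm(p * n), nrow = p)   # stand-in for a matrix stored on disk

block_size <- 250
starts <- seq(1, p, by = block_size)

# Accumulate the n x n cross-product Y'Y one block of rows at a time
YtY <- matrix(0, n, n)
for (s in starts) {
  rows <- s:min(s + block_size - 1, p)
  YtY <- YtY + crossprod(Y[rows, , drop = FALSE])
}

# Eigendecomposition of the small n x n matrix gives the right singular
# vectors (U) and the squared singular values of Y
e <- eigen(YtY, symmetric = TRUE)
U <- e$vectors
d <- sqrt(e$values)

# Recover the p-dimensional PCs block by block: V = Y U D^{-1}
V <- matrix(0, p, n)
for (s in starts) {
  rows <- s:min(s + block_size - 1, p)
  V[rows, ] <- Y[rows, , drop = FALSE] %*% U %*% diag(1 / d)
}
```

With an actual ff object, the row-block reads and writes would go to disk rather than memory, but the algebra would be the same.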
Speed improvements are driven by the fact that the sample size (n) is much smaller than the sample dimension (p), which allows an n-dimensional representation of the sample to be sufficient for most calculations.
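The following base R sketch illustrates why the n-dimensional representation suffices. It assumes a p x n data matrix with subjects in columns and skips any re-centering of the bootstrap sample; the variable names are hypothetical and this is not the bootSVD function itself. After one SVD of the full data, each bootstrap SVD only requires an n x n decomposition followed by a single matrix multiplication.

```r
set.seed(1)
p <- 2000   # measurements per subject (illustrative size)
n <- 20     # number of subjects
Y <- matrix(rnorm(p * n), nrow = p, ncol = n)

# One-time SVD of the full data: Y = V D U'
full <- svd(Y)
V   <- full$u                          # p x n matrix of sample PCs
DUt <- diag(full$d) %*% t(full$v)      # n x n representation of the sample

# A bootstrap sample resamples subjects, i.e. columns
idx <- sample(n, n, replace = TRUE)

# Low-dimensional step: SVD of an n x n matrix only
low    <- svd(DUt[, idx, drop = FALSE])
V_boot <- V %*% low$u                  # bootstrap PCs in the original p dimensions
d_boot <- low$d                        # bootstrap singular values

# Check against the direct, high-dimensional SVD of the resampled data
direct      <- svd(Y[, idx, drop = FALSE])
max_sv_diff <- max(abs(d_boot - direct$d))
recon_err   <- max(abs(V_boot %*% diag(d_boot) %*% t(low$v) - Y[, idx]))
```

Both max_sv_diff and recon_err should be at the level of floating-point rounding error, since the low-dimensional route is an exact reformulation rather than an approximation.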
install.packages("devtools")
## main package
library(devtools)
install_github('aaronjfisher/bootSVD')
library(bootSVD)
## to access help pages
help(package=bootSVD)
?bootSVD
Aaron Fisher, Brian Caffo, and Vadim Zipunnikov. Fast, Exact Bootstrap Principal Component Analysis for p>1 million. Working Paper, 2014. http://arxiv.org/abs/1405.0922