Implementation of Frequent-Directions Algorithm for Efficient Matrix Sketching

Implements the Frequent-Directions algorithm for efficient matrix sketching (E. Liberty, SIGKDD 2013).

Installation

# Not yet on CRAN
install.packages("frequentdirections")
 
# Or the development version from GitHub:
install.packages("devtools")
devtools::install_github("shinichi-takayanagi/frequentdirections")

Example

Download example data

Here we use the USPS handwritten digits dataset as sample data. The following example assumes that you have saved the dataset as /tmp/usps.h5.

Load data

The dataset contains 7291 training and 2007 test images in HDF5 (h5) format. Each image is 16 x 16 grayscale pixels, stored as one row of 256 values.

library("h5")
file <- h5file("/tmp/usps.h5")
x <- file["train/data"][]
y <- file["train/target"][]
str(x)
#>  num [1:7291, 1:256] 0 0 0 0 0 0 0 0 0 0 ...

Plot example image

For example, the digit 8:

image(matrix(x[338,], nrow=16, byrow = FALSE))

Plot SVD

Plot the original data on the plane spanned by the first and second singular vectors.

x <- scale(x)
frequentdirections::plot_svd(x, y)
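
As a rough illustration of what such a plot shows, the projection onto the first two right singular vectors can be computed in base R. This is a minimal sketch on synthetic data, not the package's implementation: `project_2d`, `x_demo` and `y_demo` are hypothetical names, and the assumption that a sketch matrix can be passed as `basis` to supply the singular vectors is a guess at how `plot_svd(x, y, b)` behaves.

```r
# Synthetic stand-in for the scaled data and labels (assumption: not the USPS data).
set.seed(1)
x_demo <- scale(matrix(rnorm(300 * 10), nrow = 300))
y_demo <- sample(0:9, 300, replace = TRUE)

# Project the rows of `x` onto the first two right singular vectors of `basis`.
# By default the basis is the data itself.
project_2d <- function(x, basis = x) {
  v <- svd(basis, nu = 0, nv = 2)$v  # ncol(x) x 2 matrix of right singular vectors
  x %*% v                            # nrow(x) x 2 matrix of coordinates
}

coords <- project_2d(x_demo)

# Draw the scatter plot only in an interactive session.
if (interactive()) {
  plot(coords, col = y_demo + 1, pch = 20,
       xlab = "1st singular vector", ylab = "2nd singular vector")
}
```

Passing a smaller matrix as `basis` projects the same rows onto the singular vectors of that matrix instead, which is one way to compare a sketch against the full data.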

Matrix Sketching

l = 8 case

eps <- 10^(-8)
# 7291 x 256 -> 8 x 256 matrix
b <- frequentdirections::sketching(x, 8, eps)
frequentdirections::plot_svd(x, y, b)
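
For intuition about what a sketching step does, here is a minimal base-R sketch of the Frequent-Directions procedure as described in Liberty (2013): rows are inserted into an l x d buffer, and whenever the buffer is full its singular values are shrunk so that at least half of the rows become zero again. The function name `fd_sketch` is hypothetical, the restriction to 2 <= l <= d is a simplification, and the package's `eps` argument is not modelled, so this should not be read as the package's own code.

```r
# Frequent-Directions sketch of an n x d matrix `a` into an l x d matrix.
# Assumption: 2 <= l <= ncol(a).
fd_sketch <- function(a, l) {
  stopifnot(is.matrix(a), l >= 2, l <= ncol(a))
  d <- ncol(a)
  b <- matrix(0, nrow = l, ncol = d)
  next_zero <- 1L   # index of the first all-zero row of b
  half <- l %/% 2
  for (i in seq_len(nrow(a))) {
    # Insert the incoming row into a zero row of the sketch.
    b[next_zero, ] <- a[i, ]
    next_zero <- next_zero + 1L
    if (next_zero > l) {
      # No zero rows left: shrink all squared singular values by the
      # squared (l/2)-th singular value and rebuild the sketch.
      s <- svd(b, nu = 0)
      delta <- s$d[half]^2
      shrunk <- sqrt(pmax(s$d^2 - delta, 0))
      b <- diag(shrunk, nrow = l) %*% t(s$v)
      # Singular values are sorted, so the nonzero rows come first.
      next_zero <- sum(shrunk > 0) + 1L
    }
  }
  b
}

# Small synthetic check (assumption: random data, not the USPS digits).
set.seed(123)
a_demo <- matrix(rnorm(200 * 20), nrow = 200)
l_demo <- 8
b_demo <- fd_sketch(a_demo, l_demo)
err_demo <- norm(crossprod(a_demo) - crossprod(b_demo), type = "2")
bound_demo <- 2 * sum(a_demo^2) / l_demo
```

For even l, the paper's guarantee is that the spectral norm of A^T A - B^T B is at most 2 ||A||_F^2 / l; `err_demo` and `bound_demo` above compute the two sides of that inequality for the synthetic input.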

l = 32 case

# 7291 x 256 -> 32 x 256 matrix
b <- frequentdirections::sketching(x, 32, eps)
frequentdirections::plot_svd(x, y, b)

l = 128 case

# 7291 x 256 -> 128 x 256 matrix
b <- frequentdirections::sketching(x, 128, eps)
frequentdirections::plot_svd(x, y, b)

This result is almost the same as the SVD plot of the original data.

In other words, a sketch with only 128 rows captures most of the structure of the original 7291-row data.
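
The visual comparison can be complemented with a number. One common measure of sketch quality is the spectral norm of x^T x - b^T b relative to the squared Frobenius norm of x, which Liberty (2013) bounds by 2 / l for a Frequent-Directions sketch with l rows. The helper below is a minimal base-R sketch of that measure; `covariance_error` is a hypothetical name and is not part of the package.

```r
# Relative covariance error of a sketch `b` with respect to the data `x`:
# spectral norm of (x^T x - b^T b) divided by the squared Frobenius norm of x.
covariance_error <- function(x, b) {
  stopifnot(ncol(x) == ncol(b))
  norm(crossprod(x) - crossprod(b), type = "2") / sum(x^2)
}

# Synthetic sanity checks (assumption: random data, not the USPS digits).
set.seed(42)
x_demo <- matrix(rnorm(500 * 30), nrow = 500)
err_exact <- covariance_error(x_demo, x_demo)                          # sketch equal to the data
err_empty <- covariance_error(x_demo, matrix(0, nrow = 4, ncol = 30))  # all-zero sketch
```

With the objects from the example, `covariance_error(x, b)` could be evaluated after each call to `frequentdirections::sketching` to see how the error changes as l grows from 8 to 128.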


0.1.0 by Shinichi Takayanagi, a month ago


https://github.com/shinichi-takayanagi/frequentdirections


Report a bug at https://github.com/shinichi-takayanagi/frequentdirections/issues


Browse source code at https://github.com/cran/frequentdirections


Authors: Shinichi Takayanagi [aut, cre] , Nagi Teramo [aut]




MIT + file LICENSE license


Imports ggplot2

Suggests testthat, knitr, rmarkdown

