# A Tidy Interface for Simulating Multivariate Data

Provides pipe-friendly (%>%) wrapper functions for MASS::mvrnorm() to create simulated multivariate data sets with groups of variables with different degrees of variance, covariance, and effect size.

`holodeck` allows quick and simple creation of simulated multivariate data with variables that co-vary or discriminate between levels of a categorical variable. The resulting simulated dataframes are useful for testing the performance of multivariate statistical techniques under different scenarios, for power analysis, or just for a sanity check when trying out a new multivariate method.
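To give a feel for what the wrappers save you from writing, here is a minimal sketch of simulating five highly correlated variables directly with `MASS::mvrnorm()`. It assumes a covariance matrix with one variance on the diagonal and one covariance everywhere else; the matrix `holodeck` builds internally may differ in detail.

```r
library(MASS)

set.seed(1)
n_vars <- 5

# Covariance matrix: 0.9 off the diagonal (covariance), 1 on it (variance)
sigma <- matrix(0.9, nrow = n_vars, ncol = n_vars)
diag(sigma) <- 1

# 20 observations drawn from a multivariate normal centred on zero
x <- mvrnorm(n = 20, mu = rep(0, n_vars), Sigma = sigma)
dim(x)  # 20 rows, 5 columns
```

Every additional group of variables means building another matrix and binding the results together by hand, which is the bookkeeping the pipe-friendly functions are meant to hide.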

## Installation

`holodeck` is available on CRAN (`install.packages("holodeck")`), and the development version can be installed from GitHub (https://github.com/Aariq/holodeck).
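Installing from GitHub might look like the following. This is a sketch that assumes you use the `remotes` package; `devtools::install_github()` works the same way.

```r
# Assumption: remotes is used as the installer; uncomment if it is missing
# install.packages("remotes")
remotes::install_github("Aariq/holodeck")
```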

`holodeck` is built to work with `dplyr` functions, including `group_by()` and the pipe (`%>%`). `purrr` is helpful for iterating over simulated data sets. For these examples I’ll use `ropls` for PCA and PLS-DA.
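Loading the packages for the examples could look like this. Note that `ropls` is distributed through Bioconductor rather than CRAN, so it has to be installed separately (for example with `BiocManager::install("ropls")`).

```r
library(holodeck)
library(dplyr)   # group_by(), select(), and the %>% pipe
library(purrr)   # iteration helpers, handy for repeated simulations
library(ropls)   # PCA and PLS-DA; installed from Bioconductor
```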

## Example 1: Investigating PCA and PLS-DA

Let’s say we want to learn more about how principal component analysis (PCA) works. Specifically, what matters more in creating a principal component: the variance or the covariance of variables? To this end, we might create a dataframe with one set of variables with high covariance and low variance and another set of variables with low covariance and high variance.

### Generate data
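A minimal sketch of such a dataframe, using `sim_covar()` with the `n_obs` and `n_vars` argument names introduced in version 0.2.0. The seed, sample size, and the specific variance and covariance values are illustrative choices, not recommendations.

```r
library(holodeck)
library(dplyr)

set.seed(925)
df1 <-
  # 5 variables that co-vary strongly but have low variance
  sim_covar(n_obs = 20, n_vars = 5, cov = 0.9, var = 1, name = "high_cov") %>%
  # 5 more variables with higher variance but little covariance
  sim_covar(n_vars = 5, cov = 0.1, var = 2, name = "high_var")

df1
```

The second call receives the first dataframe through the pipe, so it only needs the number of new variables, not the number of observations.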

Explore the covariance structure visually. The diagonal of the covariance matrix holds each variable's variance.
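One way to do this is a heatmap of the covariance matrix using base R. The sketch below regenerates the example data so it runs on its own; the plotting options are just one reasonable choice.

```r
library(holodeck)
library(dplyr)

set.seed(925)
df1 <-
  sim_covar(n_obs = 20, n_vars = 5, cov = 0.9, var = 1, name = "high_cov") %>%
  sim_covar(n_vars = 5, cov = 0.1, var = 2, name = "high_var")

# Sample covariance matrix: variances on the diagonal, covariances elsewhere
cov_mat <- cov(df1)

# Rowv/Colv = NA turns off clustering so variables keep their original order
heatmap(cov_mat, Rowv = NA, Colv = NA, symm = TRUE,
        margins = c(6, 6), main = "Covariance")
```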

Now let’s make this dataset a little more complex. We can add a factor variable, some variables that discriminate between the levels of that factor, and add some missing values.
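These three steps can be sketched with `sim_cat()`, `sim_discr()`, and `sim_missing()`. The group means and the proportion of missing values are illustrative assumptions; `sim_discr()` is applied to data grouped by the factor so that each level gets its own mean.

```r
library(holodeck)
library(dplyr)

set.seed(925)
df1 <-
  sim_covar(n_obs = 20, n_vars = 5, cov = 0.9, var = 1, name = "high_cov") %>%
  sim_covar(n_vars = 5, cov = 0.1, var = 2, name = "high_var")

set.seed(501)
df2 <-
  df1 %>%
  # add a categorical variable with three levels
  sim_cat(n_groups = 3, name = "factor") %>%
  group_by(factor) %>%
  # variables whose means differ strongly among the three levels
  sim_discr(n_vars = 5, var = 1, cov = 0,
            group_means = c(-1.3, 0, 1.3), name = "discr") %>%
  # variables that discriminate more weakly
  sim_discr(n_vars = 5, var = 1, cov = 0,
            group_means = c(0, 0.5, 1), name = "discr2") %>%
  # randomly replace roughly 10% of values with NA
  sim_missing(prop = 0.1)

df2
```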

### PCA

It looks like PCA mostly picks up on the variables with high covariance, not the variables that discriminate among levels of `factor`. This makes sense, as PCA is an unsupervised analysis.
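A PCA along these lines might be run with `ropls::opls()` as sketched below. The argument names `fig.pdfC` and `info.txtC` are assumed from recent `ropls` releases (older releases used different names), and fixing `predI = 2` is my own choice so that a two-dimensional score plot is always available.

```r
library(holodeck)
library(dplyr)
library(ropls)

set.seed(925)
df2 <-
  sim_covar(n_obs = 20, n_vars = 5, cov = 0.9, var = 1, name = "high_cov") %>%
  sim_covar(n_vars = 5, cov = 0.1, var = 2, name = "high_var") %>%
  sim_cat(n_groups = 3, name = "factor") %>%
  group_by(factor) %>%
  sim_discr(n_vars = 5, var = 1, cov = 0,
            group_means = c(-1.3, 0, 1.3), name = "discr") %>%
  sim_missing(prop = 0.1) %>%
  ungroup()

# Numeric predictors only; the grouping factor is left out of the PCA
X <- df2 %>% dplyr::select(-factor) %>% as.matrix()

# With y omitted, opls() fits a PCA
pca <- opls(X, predI = 2, fig.pdfC = "none", info.txtC = "none")

# Score plot coloured by the factor, which the PCA itself never saw
plot(pca, typeVc = "x-score", parAsColFcVn = factor(df2$factor))

# Variables with the largest absolute loadings on the first component
loadings <- getLoadingMN(pca)
head(loadings[order(abs(loadings[, 1]), decreasing = TRUE), ], 5)
```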

### PLS-DA

PLS-DA, a supervised analysis, finds discrimination among groups and finds that the discriminating variables we generated are most responsible for those differences.
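The supervised counterpart can be sketched by passing the factor as the response to `ropls::opls()`. As with the PCA, the `fig.pdfC` and `info.txtC` argument names are assumed from recent `ropls` releases, and `predI = 2` is an illustrative choice rather than a tuned value.

```r
library(holodeck)
library(dplyr)
library(ropls)

set.seed(925)
df2 <-
  sim_covar(n_obs = 20, n_vars = 5, cov = 0.9, var = 1, name = "high_cov") %>%
  sim_covar(n_vars = 5, cov = 0.1, var = 2, name = "high_var") %>%
  sim_cat(n_groups = 3, name = "factor") %>%
  group_by(factor) %>%
  sim_discr(n_vars = 5, var = 1, cov = 0,
            group_means = c(-1.3, 0, 1.3), name = "discr") %>%
  sim_missing(prop = 0.1) %>%
  ungroup()

X <- df2 %>% dplyr::select(-factor) %>% as.matrix()
y <- factor(df2$factor)

# A factor response makes opls() fit a PLS-DA model
plsda <- opls(X, y, predI = 2, fig.pdfC = "none", info.txtC = "none")
plot(plsda, typeVc = "x-score")

# Variable importance in projection (VIP); larger values contribute more
vip <- getVipVn(plsda)
sort(vip, decreasing = TRUE)[1:5]
```

If the model behaves as described above, the `discr` variables should appear near the top of the VIP ranking, although the exact order depends on the seed.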

# holodeck 0.2.0

• Changed argument names. (`p` -> `n_vars`, `N` -> `n_obs`)
• Removed helpers/wrappers to the `ropls` package. These can now be found as part of https://github.com/Aariq/chemhelper
• Actually ready for CRAN submission

# holodeck 0.1.0

• Updated DESCRIPTION

# holodeck 0.0.0.9000

• Changed package name to `holodeck`

# tidymvsim 0.0.0.9000

• Added a `NEWS.md` file to track changes to the package.

# Reference manual

`install.packages("holodeck")`

0.2.1 by Eric Scott, a year ago

https://github.com/Aariq/holodeck

Report a bug at https://github.com/Aariq/holodeck/issues

Browse source code at https://github.com/cran/holodeck

Authors: Eric Scott [aut, cre]
