Conduct Simulation Studies with a Minimal Amount of Source Code

A tool for statistical simulations that have two components: one generates the data and the other analyzes it. The main aims of the package are to reduce administrative source code (mainly loops and code for managing results) and to be simple enough that users can learn it quickly. Parallel computing is supported, and convenience functions are provided to summarize the simulation results.



An R package that facilitates simulation studies by relieving the researcher of administrative source code.

The simTool package is designed for statistical simulations that have two components: one generates the data and the other analyzes it. Its main aims are to reduce administrative source code (mainly loops and code for managing results) and to be simple enough that users can learn it quickly. Parallel computing is supported, and convenience functions are provided to summarize the simulation results.
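To make "administrative source code" concrete, here is a minimal base-R sketch of the loops and result bookkeeping that such a simulation needs when written by hand. It does not use simTool; the variable names (grid, covered, true_mean) and the small grid are assumptions chosen only for illustration.

```r
# Hand-written simulation: coverage of the t-based confidence interval
# for exponentially distributed data (base R only, no simTool).
set.seed(1)
sample_sizes <- c(10L, 50L)
conf_levels <- c(0.8, 0.95)
replications <- 200L
true_mean <- 1 / 10  # mean of an exponential distribution with rate 10

# One row per combination of data generation and analysis settings
grid <- expand.grid(n = sample_sizes, conf.level = conf_levels)
grid$coverage <- NA_real_

for (i in seq_len(nrow(grid))) {
  covered <- logical(replications)
  for (r in seq_len(replications)) {
    x <- rexp(grid$n[i], rate = 10)                           # generate data
    ci <- t.test(x, conf.level = grid$conf.level[i])$conf.int # analyze data
    covered[r] <- ci[1] <= true_mean && true_mean <= ci[2]
  }
  grid$coverage[i] <- mean(covered)                           # summarize
}
grid
```

Everything except the three commented steps (generate, analyze, summarize) is management code of the kind the package aims to remove.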

Example

This small simulation (using 4 cores) illustrates how the confidence interval based on the t-distribution performs on exponentially distributed random variables. The following lines generate exponentially distributed samples of size 10, 50, 100, and 1000. Afterwards, t.test is applied at confidence levels 0.8, 0.9, and 0.95. This is repeated 1000 times to estimate the coverage:

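library(simTool)
# Data generation: rexp with rate 10 for each sample size
dg <- expand_tibble(fun = "rexp", rate = 10, n = c(10L, 50L, 100L, 1000L))
# Analysis: t.test at each confidence level
pg <- expand_tibble(proc = "t.test", conf.level = c(0.8, 0.9, 0.95))
# Run every combination of dg and pg 1000 times on 4 cores
et <- eval_tibbles(dg, pg,
  ncpus = 4,
  replications = 10^3,
  # Per replication: does the interval cover the true mean 1 / rate = 1 / 10?
  post_analyze = function(ttest) tibble::tibble(
    coverage = ttest$conf.int[1] <= 1 / 10 && 1 / 10 <= ttest$conf.int[2]),
  # Average the coverage indicator over the replications
  summary_fun = list(mean = mean)
)
et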
#> # A tibble: 12 x 8
#>    fun    rate     n replications summary_fun proc   conf.level coverage
#>    <chr> <dbl> <int>        <int> <chr>       <chr>       <dbl>    <dbl>
#>  1 rexp     10    10            1 mean        t.test       0.8     0.754
#>  2 rexp     10    10            1 mean        t.test       0.9     0.855
#>  3 rexp     10    10            1 mean        t.test       0.95    0.905
#>  4 rexp     10    50            1 mean        t.test       0.8     0.808
#>  5 rexp     10    50            1 mean        t.test       0.9     0.905
#>  6 rexp     10    50            1 mean        t.test       0.95    0.945
#>  7 rexp     10   100            1 mean        t.test       0.8     0.792
#>  8 rexp     10   100            1 mean        t.test       0.9     0.895
#>  9 rexp     10   100            1 mean        t.test       0.95    0.936
#> 10 rexp     10  1000            1 mean        t.test       0.8     0.796
#> 11 rexp     10  1000            1 mean        t.test       0.9     0.897
#> 12 rexp     10  1000            1 mean        t.test       0.95    0.953
#> Number of data generating functions: 4
#> Number of analyzing procedures: 3
#> Number of replications: 1000
#> Estimated replications per hour: 754228
#> Start of the simulation: 2019-02-01 23:25:01
#> End of the simulation: 2019-02-01 23:25:05
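When reading the coverage column, it can help to keep the Monte Carlo error in mind. The sketch below is not part of simTool; it applies the usual binomial standard error sqrt(p(1 - p) / R) to the four coverage estimates for conf.level = 0.95 copied from the output above. The variable names are assumptions made for illustration.

```r
# Monte Carlo standard error of an estimated coverage probability
replications <- 1000
n <- c(10, 50, 100, 1000)
coverage <- c(0.905, 0.945, 0.936, 0.953)  # conf.level = 0.95 rows of the output

# Binomial standard error of each coverage estimate
mc_se <- sqrt(coverage * (1 - coverage) / replications)

# Distance from the nominal level, measured in standard errors
z <- (coverage - 0.95) / mc_se

data.frame(n = n, coverage = coverage, mc_se = mc_se, z = z)
```

With 1000 replications each standard error is below 0.01, so a gap of several hundredths, such as the one for n = 10, is unlikely to be simulation noise alone.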

Installation

You can install simTool from GitHub with:

# install.packages("devtools")
devtools::install_github("MarselScheer/simTool")

Or from CRAN with:

install.packages("simTool")

News

Version 1.1.2

Misc:

  • using a workaround in examples and vignette to circumvent a bug introduced in purrr 0.3.0 (https://github.com/tidyverse/purrr/issues/629)

Version 1.1.1

Misc:

  • .truth functionality added: the parameters of the data generation (or, alternatively, a column of the data generating matrix named .truth) are passed to the data analyzing functions; see the vignette for details
  • Unnesting of the simulation results improved

Version 1.1.0

Misc:

  • Refactoring in order to remove the dependency on reshape and plyr
  • The simulation itself is now a tibble instead of a list of lists

Version 1.0.3

Misc:

  • Adapted how libraries are loaded onto the cluster

Version 1.0.2

New Features:

  • The convenient function meanAndNormCI added

Misc:

  • Two parameters renamed (post.proc to summary.fun and value.fun to convert.result.fun). Of course, renaming parameters is one of the worst things one can do. On the other hand, only very few users will be affected by these changes.

Version 1.0.1

New Features:

  • summarizing functions also process logical results (not only numeric ones)

Misc:

  • HTML vignette (LaTeX not necessary anymore)
  • additional documentation (static pdf in JSS style)

Version 1.0

Initial release

  • parallel computing via parallel and the ideas of L'Ecuyer 1999, 2002 for random numbers
  • fallback capability
  • text progress bar
  • estimation of the number of replications per hour
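The random-number approach named in the initial release notes can be illustrated with base R's parallel package, which implements L'Ecuyer's combined multiple-recursive generator with separate streams. The sketch below shows only the general idea of independent, reproducible streams and makes no claim about simTool's internals; the helper draw_from is hypothetical.

```r
# Independent, reproducible random-number streams (base R, package parallel)
RNGkind("L'Ecuyer-CMRG")  # switches the global generator for this session
set.seed(123)

# The current seed defines the first stream; nextRNGStream() jumps to the next
stream1 <- get(".Random.seed", envir = globalenv())
stream2 <- parallel::nextRNGStream(stream1)

# Hypothetical helper: draw n exponential variates from a given stream
draw_from <- function(stream, n) {
  assign(".Random.seed", stream, envir = globalenv())
  rexp(n, rate = 10)
}

a1 <- draw_from(stream1, 5)
a2 <- draw_from(stream2, 5)
a1_again <- draw_from(stream1, 5)

identical(a1, a1_again)  # same stream reproduces the same draws
identical(a1, a2)        # different streams give different draws
```

Giving each worker its own stream is what keeps parallel replications both independent and repeatable; on a real cluster, parallel::clusterSetRNGStream distributes such streams to the workers.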


1.1.3 by Marsel Scheer, 4 months ago


https://github.com/MarselScheer/simTool


Report a bug at https://github.com/MarselScheer/simTool/issues


Browse source code at https://github.com/cran/simTool


Authors: Marsel Scheer [aut, cre]




GPL-3 license


Imports plyr, reshape, dplyr, purrr, tidyr, tibble, parallel, methods

Suggests ggplot2, knitr, boot, broom, testthat, rmarkdown

