Tidyverse-Compatible Bootstrapping

Compute arbitrary non-parametric bootstrap statistics on data in tidy data frames.


tidyboot lets you compute arbitrary non-parametric bootstrap statistics on data in tidy data frames.

Installation

You can install tidyboot from CRAN with:

install.packages("tidyboot")

You can install tidyboot from GitHub with:

devtools::install_github("langcog/tidyboot")

Examples

For the simplest use case, bootstrapping the mean and getting the mean and confidence interval of that estimate, use the convenience function tidyboot_mean() and specify the column that holds the values to average:

library(dplyr)
library(tidyboot)
 
# Two simulated groups of 500 normal draws each
# (tibble() replaces the deprecated data_frame())
gauss1 <- tibble(value = rnorm(500, mean = 0, sd = 1), condition = 1)
gauss2 <- tibble(value = rnorm(500, mean = 2, sd = 3), condition = 2)
df <- bind_rows(gauss1, gauss2)
 
df %>%
  group_by(condition) %>%
  tidyboot_mean(column = value)
#> # A tibble: 2 x 6
#>   condition     n empirical_stat  ci_lower        mean   ci_upper
#>       <dbl> <int>          <dbl>     <dbl>       <dbl>      <dbl>
#> 1         1   500    -0.02292831 -0.114664 -0.02196269 0.06232765
#> 2         2   500     2.02781100  1.755141  2.02955494 2.30626508
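
To make the columns above concrete, here is a minimal base-R sketch of a percentile bootstrap for the mean. It is not tidyboot's implementation. The seed, the 1000 resamples, and the 2.5% and 97.5% quantiles are assumptions chosen for illustration, and the names only mirror the output columns shown above.

```r
# Hand-rolled percentile bootstrap for the mean, base R only.
# Assumptions: the seed, nboot, and the 2.5%/97.5% cutoffs are illustrative.
set.seed(1)
values <- rnorm(500, mean = 0, sd = 1)
nboot <- 1000

# Resample the data with replacement and recompute the mean each time.
boot_means <- replicate(nboot, mean(sample(values, replace = TRUE)))

result <- c(
  empirical_stat = mean(values),                    # statistic on the original sample
  ci_lower = unname(quantile(boot_means, 0.025)),   # lower percentile bound
  mean = mean(boot_means),                          # mean of the bootstrap distribution
  ci_upper = unname(quantile(boot_means, 0.975))    # upper percentile bound
)
print(result)
```

The interval is read directly off the quantiles of the resampled means, so it needs no distributional assumption about the data.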

For bootstrapping any statistic and any properties of its sampling distribution, use tidyboot().

You can provide the statistic to be estimated either as a function and a column to compute it over, or as a function that takes the whole data frame and computes the relevant value.

Similarly, you can provide the properties of the sampling distribution to be computed either as a named list of functions and a column to compute them over, or as a function that takes the whole data frame and returns the relevant values.

df %>%
  group_by(condition) %>%
  tidyboot(column = value, summary_function = median,
           statistics_functions = list("mean" = mean, "sd" = sd))
#> # A tibble: 2 x 5
#>   condition     n empirical_median         mean         sd
#>       <dbl> <int>            <dbl>        <dbl>      <dbl>
#> 1         1   500     -0.006468345 -0.005842842 0.05544625
#> 2         2   500      1.995112455  2.011381756 0.20658963
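# Same analysis, with both arguments given as functions of the whole data frame
# (a named list replaces the deprecated funs())
df %>%
  group_by(condition) %>%
  tidyboot(summary_function = function(x) x %>% summarise(median = median(value)),
           statistics_functions = function(x) x %>% summarise_at(vars(median), list(mean = mean, sd = sd)))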
#> # A tibble: 2 x 5
#>   condition     n empirical_median         mean         sd
#>       <dbl> <int>            <dbl>        <dbl>      <dbl>
#> 1         1   500     -0.006468345 -0.006287976 0.05505858
#> 2         2   500      1.995112455  2.006877243 0.20645516
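
The pattern used in these two calls (a statistic, plus a named list of summaries of its sampling distribution, computed per group) can be sketched in base R as follows. The helper boot_summarise() is hypothetical and is not part of tidyboot. Its argument names are borrowed from the calls above for readability, and the seed and nboot = 1000 are assumptions.

```r
# Hypothetical helper (not part of tidyboot): bootstrap any statistic of a
# numeric vector, then summarise the resulting sampling distribution.
boot_summarise <- function(x, summary_function, statistics_functions, nboot = 1000) {
  # Recompute the statistic on nboot resamples drawn with replacement.
  boot_stats <- replicate(nboot, summary_function(sample(x, replace = TRUE)))
  c(n = length(x),
    empirical = summary_function(x),
    # Apply each named summary function to the bootstrap distribution.
    vapply(statistics_functions, function(f) f(boot_stats), numeric(1)))
}

set.seed(2)  # assumed seed, for reproducibility only
df <- data.frame(
  value = c(rnorm(500, mean = 0, sd = 1), rnorm(500, mean = 2, sd = 3)),
  condition = rep(1:2, each = 500)
)

# Split the values by group and bootstrap the median within each group.
per_group <- lapply(split(df$value, df$condition), boot_summarise,
                    summary_function = median,
                    statistics_functions = list(mean = mean, sd = sd))
out <- do.call(rbind, per_group)
print(out)
```

Swapping median for any other function of a numeric vector, or adding entries to the named list, changes what is estimated without touching the resampling step.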

News

  • column naming bug fix


0.1.1 by Mika Braginsky, a year ago


https://github.com/langcog/tidyboot


Report a bug at http://github.com/langcog/tidyboot/issues


Browse source code at https://github.com/cran/tidyboot


Authors: Mika Braginsky [aut, cre] , Daniel Yurovsky [aut]




GPL-3 license


Imports dplyr, modelr, purrr, rlang, tidyr

