Simplifies Pairwise Statistical Analyses

Pairwise group comparisons are a common analysis task. Many packages can perform them, but often only a subset of the comparisons is desired. 'SimplifyStats' performs pairwise comparisons and returns the results in a tidy format.


In many analyses, pairwise group comparisons or groupwise descriptive statistics are produced for numerous variables. 'SimplifyStats' is an R package consisting of a set of functions that simplify this process.


Functions by category

Groupwise descriptive statistics

The function group_summarize accepts a data frame as input and uses the names of user-specified columns of grouping variables to partition the data. For each unique combination of grouping-variable levels, univariate descriptive statistics are computed for a second set of user-specified columns of numeric variables.

The specific statistics computed are:

  • Sample size (N)
  • Mean
  • Standard deviation (StdDev)
  • Standard error (StdErr)
  • Minimum value (Min)
  • First quartile value (Quartile1)
  • Median
  • Third quartile value (Quartile3)
  • Maximum value (Max)
  • Proportion of missing values (PropNA)
  • Kurtosis
  • Skewness
  • Jarque-Bera test P value (Jarque-Bera_p.value)
  • Shapiro-Wilk test P value (Shapiro-Wilk_p.value)
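
The statistics above can be illustrated for a single numeric vector with a short base R sketch. It uses the usual textbook definitions (moment-based skewness and Pearson kurtosis, and the Jarque-Bera statistic built from them); the package may differ in details such as whether N counts missing values. The helper describe_vector and its output column names are hypothetical and are not part of SimplifyStats.

```r
# Sketch: textbook definitions of the listed statistics for one numeric vector.
# describe_vector() is a hypothetical helper, not a SimplifyStats function.
describe_vector <- function(x) {
  prop_na <- mean(is.na(x))   # proportion of missing values
  x <- x[!is.na(x)]           # remaining statistics use observed values only
  n <- length(x)
  m <- mean(x)
  m2 <- mean((x - m)^2)       # central moments (divide by n, not n - 1)
  m3 <- mean((x - m)^3)
  m4 <- mean((x - m)^4)
  skew <- m3 / m2^1.5
  kurt <- m4 / m2^2           # Pearson kurtosis; equals 3 for a normal distribution
  jb <- n * (skew^2 / 6 + (kurt - 3)^2 / 24)   # Jarque-Bera statistic
  q <- quantile(x, probs = c(0.25, 0.5, 0.75), names = FALSE)
  data.frame(
    N = n, Mean = m, StdDev = sd(x), StdErr = sd(x) / sqrt(n),
    Min = min(x), Quartile1 = q[1], Median = q[2], Quartile3 = q[3], Max = max(x),
    PropNA = prop_na, Kurtosis = kurt, Skewness = skew,
    JarqueBera_p = pchisq(jb, df = 2, lower.tail = FALSE),
    ShapiroWilk_p = shapiro.test(x)$p.value
  )
}

example_stats <- describe_vector(c(2.1, 3.4, NA, 5.0, 4.2, 3.9, 6.8))
print(example_stats)
```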

These values are returned in an object of class group_summary, which holds the results and the input parameters (excluding the input data frame). The results are stored in a list of data frames, where each element of the list is named according to the variable for which statistics were computed. Additional parameters, e.g. na.rm = TRUE, can be passed to group_summarize.
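
The partition-then-summarize idea, and the shape of a result list with one data frame per variable, can be sketched in base R as follows. This is only an illustration under stated assumptions: summarize_by_group is a hypothetical helper, it computes just three of the statistics, and it does not reproduce the group_summary class or the package's actual output columns.

```r
# Sketch: partition a data frame by grouping columns, then summarize each variable.
# summarize_by_group() is a hypothetical helper, not the package's group_summarize.
summarize_by_group <- function(df, group_cols, var_cols, na.rm = TRUE) {
  # One factor level per observed combination of the grouping columns
  groups <- interaction(df[group_cols], drop = TRUE, sep = ".")
  results <- lapply(var_cols, function(v) {
    pieces <- split(df[[v]], groups)
    data.frame(
      Group = names(pieces),
      N = vapply(pieces, length, integer(1)),
      Mean = vapply(pieces, mean, numeric(1), na.rm = na.rm),
      StdDev = vapply(pieces, sd, numeric(1), na.rm = na.rm),
      row.names = NULL
    )
  })
  names(results) <- var_cols   # one data frame per summarized variable
  results
}

# mtcars ships with R; cyl and am act as grouping variables here
res <- summarize_by_group(mtcars, group_cols = c("cyl", "am"),
                          var_cols = c("mpg", "hp"))
print(res$mpg)
```

Each element of res is addressed by variable name (res$mpg, res$hp), mirroring the list-of-data-frames layout described above.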

Pairwise hypothesis testing

Like group_summarize, the function pairwise_stats accepts as input a data frame and the names of user-specified columns of grouping variables. Unlike group_summarize, pairwise_stats was originally limited to one numeric variable per call; as of version 2.0.2 it can evaluate multiple variables in a single call. Every pairwise group comparison is made using a user-specified function, which must accept as its first and second arguments the vectors of values for the two groups being compared (e.g. t.test, wilcox.test, ks.test, or a custom function f(a, b)). In some cases the order in which these vectors are passed to the function matters, e.g. when setting alternative = "greater" in t.test. To account for this possibility, two_way = TRUE can be passed to pairwise_stats. This tests all possible pairs of unique grouping variable interactions in both forward and reverse order. With this function, all two-sample hypothesis tests can be computed quickly.
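
Why the order of the two vectors matters, and what testing in both directions means, can be seen in a small base R sketch. The helper pairwise_sketch is hypothetical and is not the package's pairwise_stats; it handles one variable and assumes the test function returns an object with a p.value element, as t.test, wilcox.test, and ks.test do.

```r
# Sketch: run a two-sample test on every pair of groups, optionally in both orders.
# pairwise_sketch() is a hypothetical helper, not the package's pairwise_stats.
pairwise_sketch <- function(df, group_cols, var_col, fxn, two_way = FALSE, ...) {
  groups <- interaction(df[group_cols], drop = TRUE, sep = ".")
  pieces <- split(df[[var_col]], groups)
  pairs <- t(combn(names(pieces), 2))   # each unordered pair once, one row per pair
  if (two_way) {
    # Append the same pairs with the two groups swapped
    pairs <- rbind(pairs, pairs[, 2:1, drop = FALSE])
  }
  p_values <- numeric(nrow(pairs))
  for (i in seq_len(nrow(pairs))) {
    # Extra arguments (e.g. alternative = "greater") are forwarded to the test
    p_values[i] <- fxn(pieces[[pairs[i, 1]]], pieces[[pairs[i, 2]]], ...)$p.value
  }
  data.frame(A = pairs[, 1], B = pairs[, 2], p.value = p_values)
}

# iris ships with R; Species defines three groups, so there are three pairs
one_way <- pairwise_sketch(iris, "Species", "Sepal.Length", t.test)
both_ways <- pairwise_sketch(iris, "Species", "Sepal.Length", t.test,
                             two_way = TRUE, alternative = "greater")
print(both_ways)
```

With a one-sided alternative, the forward and reverse rows for the same pair give different P values, which is the situation the two_way option is meant to cover.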

News

SimplifyStats 2.0.2

  • Updated output and provided argument to request legacy output format.
  • Updated the pairwise_stats function to enable evaluation of multiple variables in a single call.
  • Updated examples and tests to handle the updated R RNG method.
  • Switched to the MIT license.


install.packages("SimplifyStats")

2.0.2 by Zachary Colburn, 9 months ago


Browse source code at https://github.com/cran/SimplifyStats


Authors: Zachary Colburn


Documentation:   PDF Manual  


MIT + file LICENSE license


Imports assertthat, tibble, dplyr, broom, moments

Suggests testthat, knitr, rmarkdown

