Simplifies Pairwise Statistical Analyses

Pairwise group comparisons are a common analysis. Although many packages can perform them, often only a subset of the possible comparisons is desired. 'SimplifyStats' performs pairwise comparisons and returns the results in a tidy format.

In many analyses, pairwise group comparisons or groupwise descriptive statistics are produced for numerous variables. 'SimplifyStats' is an R package consisting of a set of functions that simplify this process.


Functions by category

Groupwise descriptive statistics

The function group_summarize accepts a data frame and the names of user-specified grouping columns, which it uses to partition the data. For each unique combination of levels of the grouping variables, univariate descriptive statistics are computed for a second set of user-specified numeric columns.

The specific statistics computed are:

  • Sample size (N)
  • Mean
  • Standard deviation (StdDev)
  • Standard error (StdErr)
  • Minimum value (Min)
  • First quartile value (Quartile1)
  • Median
  • Third quartile value (Quartile3)
  • Maximum value (Max)
  • Proportion of missing values (PropNA)
  • Kurtosis
  • Skewness
  • Jarque-Bera test P value (Jarque-Bera_p.value)
  • Shapiro-Wilk test P value (Shapiro-Wilk_p.value)

These values are returned in an object of class group_summary, which holds the results and the input parameters (excluding the input data frame). The results are stored in a list of data frames, where each element of the list is named after the variable for which statistics were computed. Additional parameters, e.g. na.rm = TRUE, can be passed to group_summarize.
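The partition-then-summarize idea can be sketched in base R. This is an illustration only, not the package's implementation: the helper names sketch_group_summary and summarize_one and the toy data are hypothetical, and only a subset of the listed statistics is computed.

```r
# Illustrative base R sketch; NOT the SimplifyStats implementation.
# Toy data: two grouping columns (X, Y) and two numeric columns (A, B).
set.seed(1)
toy <- data.frame(
  X = rep(c("a", "b"), each = 20),
  Y = rep(c("u", "v"), times = 20),
  A = rnorm(40),
  B = runif(40)
)

# Hypothetical helper: a few of the listed statistics for one group's values.
summarize_one <- function(values) {
  n_missing <- sum(is.na(values))
  observed <- values[!is.na(values)]
  n <- length(observed)
  data.frame(
    N = n,
    Mean = mean(observed),
    StdDev = sd(observed),
    StdErr = sd(observed) / sqrt(n),
    Min = min(observed),
    Median = median(observed),
    Max = max(observed),
    PropNA = n_missing / length(values)
  )
}

# Hypothetical helper: one data frame per variable, one row per group combination.
sketch_group_summary <- function(data, group_cols, var_cols) {
  # One factor level per observed combination of grouping-variable levels.
  groups <- interaction(data[group_cols], drop = TRUE)
  results <- lapply(var_cols, function(v) {
    per_group <- lapply(split(data[[v]], groups), summarize_one)
    data.frame(Group = names(per_group), do.call(rbind, per_group),
               row.names = NULL)
  })
  names(results) <- var_cols  # list elements are named after the variables
  results
}

summaries <- sketch_group_summary(toy, c("X", "Y"), c("A", "B"))
summaries$A
```

The returned list mirrors the structure described above (one data frame per variable), but the column layout of the real group_summary object may differ.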

Pairwise hypothesis testing

Like group_summarize, the function pairwise_stats accepts as input a data frame and the names of user-specified columns of grouping variables. Unlike group_summarize, pairwise_stats originally accepted only one numeric variable for analysis; as of version 2.0.2 it can evaluate multiple variables in a single call. The comparison is made with a user-specified function, which must accept as its first and second arguments the vectors of values for the two groups being compared (e.g. t.test, wilcox.test, ks.test, or a custom function f(a,b)). Every pairwise combination of groups is compared. In some cases the order in which these vectors are passed to the function matters, e.g. when setting alternative = "greater" in t.test. To account for this possibility, two_way = TRUE can be passed to pairwise_stats. This tests every pair of unique grouping variable combinations in both forward and reverse order. With this function, all two-sample hypothesis tests can be computed quickly.
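The effect of testing in both orders can be sketched in base R. This is an illustration under stated assumptions, not the package's code: sketch_pairwise and its argument names are hypothetical, the toy data are made up, and the test function is assumed to return an object with a p.value element (as t.test, wilcox.test, and ks.test do).

```r
# Illustrative base R sketch; NOT the SimplifyStats implementation.
set.seed(2)
toy <- data.frame(
  X = rep(c("a", "b", "c"), each = 15),
  A = c(rnorm(15, 0), rnorm(15, 1), rnorm(15, 2))
)

# Hypothetical helper: apply fxn(a, b, ...) to every pair of groups.
sketch_pairwise <- function(data, group_cols, var_col, fxn,
                            two_way = FALSE, ...) {
  groups <- interaction(data[group_cols], drop = TRUE)
  values <- split(data[[var_col]], groups)
  # Each row is one unordered pair of group names.
  pairs <- t(combn(names(values), 2))
  if (two_way) {
    # Append the same pairs with the argument order reversed.
    pairs <- rbind(pairs, pairs[, 2:1, drop = FALSE])
  }
  p_values <- vapply(seq_len(nrow(pairs)), function(i) {
    # Assumes fxn returns an object with a p.value element.
    fxn(values[[pairs[i, 1]]], values[[pairs[i, 2]]], ...)$p.value
  }, numeric(1))
  data.frame(GroupA = pairs[, 1], GroupB = pairs[, 2], p.value = p_values)
}

# Three groups give three unordered pairs.
one_way <- sketch_pairwise(toy, "X", "A", t.test)

# With a one-sided alternative the order matters, so test both directions.
both_ways <- sketch_pairwise(toy, "X", "A", t.test,
                             two_way = TRUE, alternative = "greater")
```

With three groups, the one-way call makes three comparisons and the two-way call makes six. For a one-sided t.test, the forward and reverse P values of a pair differ, which is why the reverse order is worth computing.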


SimplifyStats 2.0.2

  • Updated output and provided argument to request legacy output format.
  • Updated the pairwise_stats function to enable evaluation of multiple variables in a single call.
  • Updated examples and tests to handle the updated R RNG method.
  • Switched to the MIT license.



2.0.4 by Zachary Colburn, a year ago


Authors: Zachary Colburn


MIT + file LICENSE license

Imports assertthat, tibble, dplyr, broom, moments

Suggests testthat, knitr, rmarkdown
