Bayesian Framework for Computational Modeling

Derived from the work of Kruschke (2015, ), the present package aims to provide a framework for conducting Bayesian analysis using Markov chain Monte Carlo (MCMC) sampling utilizing the Just Another Gibbs Sampler ('JAGS', Plummer, 2003, < http://mcmc-jags.sourceforge.net/>). The initial version includes several modules for conducting Bayesian equivalents of chi-squared tests, analysis of variance (ANOVA), multiple (hierarchical) regression, softmax regression, and for fitting data (e.g., structural equation modeling).


bfw: Bayesian Framework for Computational Modeling

Logo II

News CRAN Version GitHub Version
License Build Status

The purpose of bfw is to establish a framework for conducting Bayesian analysis in R, using MCMC and JAGS (Plummer, 2003). The framework provides several modules to conduct linear and non-linear (hierarchical) analyses, and allows the use of custom functions and complex JAGS models.

Derived from the excellent work of Kruschke (2015), the goal of the framework is to easily estimate parameter values and the stability of estimates from the highest density interval (HDI), make null value assessment through region of practical equivalence testing (ROPE) and conduct convergence diagnostics (e.g., Gelman & Rubin, 1992).

Users are encouraged to apply justified priors by modifying existing JAGS models found in extdata/models or by adding custom models. Similarly, one might modify models to conduct posterior predictive checks (see Kruschke, 2013). The purpose of the framework is not to provide generic modules suitable for all circumstances, but rather act as a platform for modifying or developing models for a given project.

List of current modules

  • Bernoulli trials
  • Covariate estimations (including correlation and Cronbach's alpha)
  • Fit observed and latent data (e.g., SEM, CFA, mediation models)
  • Bayesian equivalent of Cohen's kappa
  • Mean and standard deviation estimations
  • Predict metric values (cf., ANOVA)
  • Predict nominal values (cf., chi-square test)
  • Simple, multiple and hierarchical regression
  • Softmax regression (i.e., multinomial logistic regression)

List of current visualizations

  • Plot density of parameter values (including ROPE)
  • Plot mean data (including repeated measures)
  • Plot nominal data (e.g., expected and observed values)
  • Plot circlize data (e.g., multiple response categories)

Prerequisites

Dependencies

Dependencies are automatically installed from CRAN. By default, outdated dependencies are automatically upgraded.

Installing

You can install bfw from GitHub. If you already have a previous version of bfw installed, using the command below will update to the latest development version.

Development version (GitHub)

devtools::install_github("oeysan/bfw")

Please note that stable versions are hosted at CRAN, whereas GitHub versions are in active development.

Stable version (CRAN)

utils::install.packages("bfw")

Issues

Please report any bugs/issues here

Example 1: Normal distributed data

Compute mean and standard deviation estimates.
Please see manual for more examples.

# Apply MASS to create normal distributed data with mean = 0 and standard deviation = 1
set.seed(99)
data <- data.frame(y =
                     MASS::mvrnorm(n=100,
                                   mu=0,
                                   Sigma=matrix(1, 1),
                                   empirical=TRUE) )

# Run normal distribution analysis on data
## Use 50 000 iterations, set seed for replication.
### Null value assessment is set at ROPE = -0.5 - 0.5 for mean
mcmc <- bfw(project.data = data,
            y = "y",
            saved.steps = 50000,
            jags.model = "mean",
            jags.seed = 100,
            ROPE = c(-0.5,0.5),
            silent = TRUE)

# Run t-distribution analysis on data
mcmc.robust <- bfw(project.data = data,
                   y = "y",
                   saved.steps = 50000,
                   jags.model = "mean",
                   jags.seed = 101,
                   ROPE = c(-0.5,0.5),
                   run.robust = TRUE,
                   silent = TRUE)

# Use psych to describe the normally distributed data
psych::describe(data)[,c(2:12)]
#>      n mean sd median trimmed  mad   min  max range  skew kurtosis
#> X1 100    0  1   0.17    0.04 0.81 -3.26 2.19  5.45 -0.57     0.53

# Print summary of normal distribution analysis
## Only the most relevant information is shown here
round(mcmc$summary.MCMC[,c(3:6,9:12)],3)
#>               Mode   ESS  HDIlo HDIhi ROPElo ROPEhi ROPEin   n
#> mu[1]: Y    -0.003 49830 -0.201 0.196      0      0    100 100
#> sigma[1]: Y  0.995 48890  0.869 1.150      0    100      0 100

# Print summary of t-distribution analysis
round(mcmc.robust$summary.MCMC[,c(3:6,9:12)],3)
#>              Mode   ESS  HDIlo HDIhi ROPElo ROPEhi ROPEin   n
#> mu[1]: Y    0.027 25887 -0.167 0.229      0      0    100 100
#> sigma[1]: Y 0.933 10275  0.749 1.115      0    100      0 100

Example 2: Same data but with outliers

# Add 10 outliers, each with a value of 10.
biased.data <- rbind(data,data.frame(y = rep(10,10)))

# Run normal distribution analyis on biased data
biased.mcmc <- bfw(project.data = biased.data,
                   y = "y",
                   saved.steps = 50000,
                   jags.model = "mean",
                   jags.seed = 102,
                   ROPE = c(-0.5,0.5),
                   silent = TRUE)

# Run t-distribution analysis on biased data
biased.mcmc.robust <- bfw(project.data = biased.data,
                          y = "y",
                          saved.steps = 50000,
                          jags.model = "mean",
                          jags.seed = 103,
                          ROPE = c(-0.5,0.5),
                          run.robust =TRUE,
                          silent = TRUE)

# Use psych to describe the biased data
psych::describe(biased.data)[,c(2:12)]
#>      n mean   sd median trimmed  mad   min max range skew kurtosis
#> X1 110 0.91 3.04   0.25    0.21 0.92 -3.26  10  13.3  2.3     4.38

# Print summary of normal distribution analysis on biased data
## Only the most relevant information is shown here
round(biased.mcmc$summary.MCMC[,c(3:6,9:12)],3)
#>              Mode   ESS HDIlo HDIhi ROPElo ROPEhi ROPEin   n
#> mu[1]: Y    0.922 50000 0.346  1.50      0     92   8.05 110
#> sigma[1]: Y 2.995 49069 2.662  3.48      0    100   0.00 110

# # Print summary of t-distribution analysis on biased data
round(biased.mcmc.robust$summary.MCMC[,c(3:6,9:12)],3)
#>              Mode   ESS  HDIlo HDIhi ROPElo ROPEhi  ROPEin   n
#> mu[1]: Y    0.168 29405 -0.015 0.355      0  0.008  99.992 110
#> sigma[1]: Y 0.679 17597  0.512 0.903      0 99.128   0.872 110

Example 3: Custom function and model

Shamelessly adapted from here (credits to James Curran).

# Create a function for left-censored data
custom.function <- function(DF, ...) {

  x <- as.vector(unlist(DF))
  x[x < log(29)] = NA
  n <- length(x)
  LOD <- rep(log(29), n)
  is.above.LOD <- ifelse(!is.na(x), 1, 0)
  lambda <- 0.5
  initial.x = rep(NA, length(x))
  n.missing <- sum(!is.above.LOD)
  initial.x[!is.above.LOD] <- runif(n.missing, 0, log(29))

  initial.list <- list(lambda = lambda,
                       x = initial.x
  )

  data.list <- list(n = n,
                    x = x,
                    LOD = LOD,
                    is.above.LOD = is.above.LOD
  )

  # Return data list
  return (
    list (
      params = "lambda",
      initial.list = initial.list,
      data.list = data.list,
      n.data = as.matrix(n)
      )
    )
}

# Create a model
custom.model = "
  model{
    for(i in 1:n){
      is.above.LOD[i] ~ dinterval(x[i], LOD[i])
      x[i] ~ dexp(lambda)
    }
    lambda ~ dgamma(0.001, 0.001)
  }
"

# Simulate some data
set.seed(35202)
project.data <- as.matrix(rexp(10000, rate = 1.05))

# Run analysis
custom.mcmc <- bfw(project.data = project.data,
                   custom.function = custom.function,
                   custom.model = custom.model,
                   saved.steps = 50000,
                   jags.seed = 100,
                   ROPE = c(1.01,1.05),
                   silent = TRUE)

# Print analysis
round(custom.mcmc$summary.MCMC[,c(3,5,6,9:12)],3)
#>     Mode    HDIlo    HDIhi   ROPElo   ROPEhi   ROPEin        n 
#>     1.03     1.00     1.06     7.64    14.33    78.03 10000.00

The cost of conducting robust estimates

# Running time for normal distribution analyis
biased.mcmc$run.time[2] - biased.mcmc$run.time[1]
#> Time difference of 7.66 secs

# Running time for t-distribution analysis
biased.mcmc.robust$run.time[2] - biased.mcmc.robust$run.time[1]
#> Time difference of 32.9 secs

License

This project is licensed under the MIT License - see LICENSE.md for details

Acknowledgments

  • John Kruschke for his overall amazing work, and especially for his workshop at ICPSR 2016. It opened my eyes to Bayesian statistics.
  • Martyn Plummer for his work on JAGS. Cheers!

References

  • Gelman, A., & Rubin, D. B. (1992). Inference from Iterative Simulation Using Multiple Sequences. Statistical Science, 7(4), 457-472. https://doi.org/10.1214/ss/1177011136
  • Kruschke, J. K. (2013). Posterior predictive checks can and should be Bayesian: Comment on Gelman and Shalizi, 'Philosophy and the practice of Bayesian statistics'. British Journal of Mathematical and Statistical Psychology, 66(1), 4556. https://doi.org/10.1111/j.2044-8317.2012.02063.x
  • Kruschke, J. K. (2015). Doing Bayesian data analysis: a tutorial with R, JAGS, and Stan. Academic Press: Boston.
  • Plummer, M. (2003). JAGS A program for analysis of Bayesian graphical models using Gibbs sampling (Version 4.3.0). Retrieved from http://mcmc-jags.sourceforge.net/

News

bfw 0.4.0

Critical

  • Fixed an sorting error in contrasts function

Minor

  • Added the ability to use custom save names
  • Removed the appended "Plot" in save names for ParsePlot function

bfw 0.3.0.9002

Minor

  • Fixed factor-to-numeric-bug metric function

bfw 0.3.0.9001

Feature

  • Added InverseHDI function to compute inverse cumulative density function of the distribution

Minor

  • Added option to add results to ETA function

bfw 0.3.0

Feature

  • Added PlotParam function to plot density of parameter values (including ROPE)

Moderate

  • Removed PlotData function. All plots are now called from seperate functions:
    • PlotCirclize to create a circlize plot
    • PlotMean to create a mean plot
    • PlotNominal to create a nominal plot
    • PlotParam to create a density plot with parameter values

Minor

  • Small fix in SumMCMC function to compute sample sizes

bfw 0.2.0.9005

Feature

  • Updated CFA function to include correlation matrix
  • Added a option to run PPP for every kth length of MCMC chains (Default is every 10th)

Moderate

  • Optimized RunContrasts to allow larger MCMC simulations (2nd review)

Minor

  • Fixed plot_data vignette
  • Updated README
  • Fixed title bug in circlize plots
  • Added RemoveGarbage function to clear up working memory
  • Added MultiGrep function to use multiple patterns to select an element from a vector
  • Fixed bug in kappa function
  • Fixed bug in covariate function
  • Fixed inline comment bug in TidyCode function
  • Added option to define which parameters to use for diagnostics
  • Removed (some of the...) unnecessary arguments in bfw function
  • Added a apa PowerPoint template

bfw 0.2.0.9004

Feature

  • Added nominal and circlize (using the circlize package) plot types
    • mean plots are now seperated from main PlotData function
    • ParsePlot and PlotData functions are also seperated

Critical

  • Fixed an error in nominal function

bfw 0.2.0.9003

Minor

  • Fixed variables namnes in nominal function when using only 1 variable

bfw 0.2.0.9003

Minor

  • Fixed issue of line break after HTML tags when rendering Rmd files.

bfw 0.2.0.9002

Moderate

  • Fixed ParsePlot to accommodate ggplots.
    • PlotData now returns ggplot2 and not grDevices::recordPlot()

Minor

  • Added a second badge, a more informative badge, 'cos why not.

bfw 0.2.0.9001

Minor

  • Some typos

bfw 0.2.0.9000

Minor

  • Added a badge because all the cool kids have them
  • Fixed small inconsistencies in ParsePlot parameters (i.e., type png with layout pw).
    • Defaults are now rasterized pdf on a4 layout with 12 pointsize.

bfw 0.2.0

Critical

  • Optimized RunContrasts to allow larger MCMC simulations

Moderate

  • Optimized ParsePlot to handle large amounts of plots
  • Added png package to Suggests to handle rasterized graphics for pdf/ps.
  • Created a small function called TidyCode to format messy code.
  • Modified nominal and metric functions to use a single flexible jags model for each function respectively
  • Added an extended covariate function to include y and x variables.
  • Redefined fit data models
    • Created a separate model for confirmatory factor analysis using Wishart distribution
    • Redefined structural equation models, making the models more efficient
  • Added options to have more control over JAGS simulations
    • jags.method, specify method for JAGS (e.g., parallel or simple)
    • jags.chains, specify specify number of chains for JAGS
    • adapt.steps, the number of adaptive iterations to use at the start of each simulation
    • burnin.steps, the number of burnin iterations, NOT including the adaptive iterations to use for the simulation.

Minor

  • Added a small ETA function to display running time of functions
  • Made feedback from MCMC functions more informative
  • Fixed custom model in settings
  • Fixed job.title bug in covariate
  • Fixed NULL values in TrimSplit and CapWords
  • Replaced robust estimates of R^2 with Spearman
  • Moved vignettes from vignettes to doc

bfw 0.1.0

Moderate

  • Migrated from the orphaned ReporteRs to officer (thanks to Professor Brian Ripley at University of Oxford for notifying me)
    • Added two PowerPoint templates legacy (4:3) and widescreen (16:9)
  • The following packages are moved from Imports to Suggests
    • ggplot2
    • lavaan
    • officer
    • psych
    • robust
    • rvg
    • scales
    • truncnorm

Minor

  • Modified title of package from Computational Modelling to Computational Modeling to conform with US spelling
  • Recoded diagnostics, making the code more efficent.
  • Correcting some typos and code aesthetics (e.g., replaced print with cat to display running time of analyses)
  • Reviewed TrimSplit function to include removing empty elements from vector
  • Removed methods and rJava from imports

bfw 0.0.1

  • Initial launch with the following modules:
    • Bernoulli trials
    • Covariate estimations (including correlation and Cronbach`s alpha)
    • Fit data (e.g., SEM, CFA, mediation models)
    • Bayesian equivalent of Cohen`s kappa
    • Mean and standard deviation estimations
    • Predict metric values (cf., ANOVA)
    • Predict nominal values (c.f., chi-square test)
    • Simple, multiple and hierarchical regression
    • Softmax regression (i.e., multinomial logistic regression)
  • Added a NEWS.md file to track changes to the package.
  • Added a TODO.md file to track future work on the package.
  • Added a README.md file as an introduction to the package.

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("bfw")

0.4.0 by Øystein Olav Skaar, 4 months ago


https://github.com/oeysan/bfw/


Report a bug at https://github.com/oeysan/bfw/issues/


Browse source code at https://github.com/cran/bfw


Authors: Øystein Olav Skaar [aut, cre]


Documentation:   PDF Manual  


MIT + file LICENSE license


Imports coda, MASS, runjags

Suggests covr, circlize, dplyr, ggplot2, knitr, lavaan, magrittr, officer, plyr, png, psych, rmarkdown, rvg, scales, testthat

System requirements: JAGS >=4.3.0 <http://mcmc-jags.sourceforge.net/>, Java JDK >=1.4 <https://www.java.com/en/download/manual.jsp>


See at CRAN