Cognitive Models

Hierarchical Bayesian models. The package provides tools to fit two response time models, using population-based Markov chain Monte Carlo.


ggdmc, which evolved from the Dynamic Models of Choice (DMC; Heathcote et al., 2018), is a generic tool for hierarchical Bayesian computation.

  1. Instead of using Gibbs or HMC, ggdmc uses population-based MCMC (pMCMC) samplers. A notable Gibbs-based example is the Python-based HDDM (Wiecki, Sofer & Frank, 2013), which does not allow the user to conveniently set the variabilities of the DDM parameters.

  2. ggdmc uses a different variant of the migration operator, which safeguards detailed balance, so it is not imperative to turn off migration. For some complex models, it is preferable to use a mixture of different genetic operators to prevent premature convergence. Further, pMCMC is more efficient when a combination of operators is used.

  3. ggdmc provides two methods of parallelism (see the sketch after this list). The first uses the parallel package in R, which speeds up fitting many participants independently, namely fixed-effects models. The second uses OpenMP at the C++ level, which speeds up fitting hierarchical models. For an advanced parallel computation technique / algorithm, please see my R package, ppda, which uses CUDA C for massively parallel computations.
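For example, here is a minimal sketch of the first method. It assumes a data model instance (dmi) and a prior (p.prior) built as in the examples below; StartManynewsamples and run are demonstrated in the fixed-effects section.

require(ggdmc)
thin <- 2
nchain <- 27
## Fit each participant independently, dispatching the fits to 8 cores via
## the parallel package. OpenMP at the C++ level is used instead when
## fitting hierarchical models.
sam <- run(StartManynewsamples(512, dmi, p.prior, thin, nchain),
  pm = .2, ncore = 8)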

Getting Started

Below is an example using the LBA model (Brown & Heathcote, 2008). Please see my tutorials site for more details. The names of the R functions in ggdmc mostly describe what they do, such as BuildModel. The syntax differs from DMC; although ggdmc may look similar to DMC, its core Bayesian sampling routines are a new implementation, because of the different methods of parallelism and the pMCMC algorithms. Please use it at your own risk.

Note that the order of parameters in a parameter vector (i.e., p.vector) must strictly follow the order reported by BuildModel; otherwise, the run function will throw an error to stop you from fitting the model. A quick check is shown right after the BuildModel call below.

require(ggdmc) 
model <- BuildModel(p.map = list(A = "1", B = "R", t0 = "1",
                            mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
          match.map = list(M = list(s1 = 1, s2 = 2)),
          factors   = list(S = c("s1", "s2"), F = c("f1", "f2")),
          constants = c(sd_v.false = 1, st0 = 0), 
          responses = c("r1", "r2"),
          type      = "norm")

## Population distribution, rate effect on F
pop.mean <- c(A = .4, B.r1 = 1.2, B.r2 = 2.8, t0 = .2,
              mean_v.f1.true = 2.5, mean_v.f2.true = 1.5, mean_v.f1.false = .35,
              mean_v.f2.false = .25, sd_v.true = .25)
pop.scale <- c(A = .1, B.r1 = .1, B.r2 = .1, t0 = .05,
               mean_v.f1.true = .2, mean_v.f2.true = .2, mean_v.f1.false = .2,
               mean_v.f2.false = .2, sd_v.true = .1)
pop.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0,  0, .1, NA, NA, NA,  0,  0),
  upper = c(NA, NA, NA,  1, NA, NA, NA, NA, NA))

## Draw parameter prior distributions to visually check
plot(pop.prior)
print(pop.prior)

## Simulate 20 participants, with 30 responses per condition.
## The true parameter vectors are drawn from the parameter prior distribution
## specified in 'pop.prior' 
dat <- simulate(model, nsim = 30, nsub = 20, prior = pop.prior)
dmi <- BuildDMI(dat, model)    ## dmi = data model instance 

## Extract the means and variabilities of the parameters across the 20 participants
## ps <- attr(dat, "parameters")
##
## Please use the matrixStats package, which is even faster than the C
## functions in the base package
##
## round(matrixStats::colMeans2(ps), 2)
## round(matrixStats::colSds(ps), 2)
##    A  B.r1  B.r2  mean_v.f1.true  mean_v.f2.true mean_v.f1.false 
## 0.43  1.22  1.00            0.21            2.48            1.45 
## 0.10  0.10  0.01            0.04            0.17            0.19 
## mean_v.f2.false  sd_v.true     t0
##            0.34       0.36   0.23
##            0.16       0.18   0.10
           
## FIT hierarchical model
p.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0, 0, .1, NA, NA, NA,  0,  0),
  upper = c(NA, NA, NA,  1, NA, NA, NA, NA, NA))

## Specify prior distributions at the hyper level
mu.prior <- BuildPrior(
  dists = rep("tnorm", 9),
  p1    = pop.mean,                           
  p2    = c(1,   1,  1,  2,   2,  2,  2,  1, 1),
  lower = c(0,   0,  0, .1,  NA, NA, NA, NA, 0),
  upper = c(NA, NA, NA, NA,  NA, NA, NA, NA, NA))

## lower and upper are taken care of by defaults.
sigma.prior <- BuildPrior(
  dists = rep("beta", 9),
  p1    = c(A=1, B.r1=1, B.r2=1, t0 = 1, mean_v.f1.true=1, mean_v.f2.true=1,
            mean_v.f1.false=1, mean_v.f2.false=1, sd_v.true = 1),
  p2    = rep(1, 9))

pp.prior <- list(mu.prior, sigma.prior)

## Visually check mu priors and sigma priors
plot(pp.prior[[1]])
plot(pp.prior[[2]])

## Initialise a small sample 
## Get the number of parameters
npar <- length(GetPNames(model)); npar
thin <- 2

## Initialise 100 new hierarchical samples, randomly specifying only the first 
## iteration. The remaining iterations are NA (parameters) or -Inf (log densities).
hsam <- StartNewHypersamples(1e2, dmi, p.prior, pp.prior)
hsam <- run(hsam, report = 20, pm = .3, hpm = .3)
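After this short pilot run, it is worth a quick visual check before investing in longer runs (a sketch using the same plot methods demonstrated in the piping example below):

plot(hsam, pll = TRUE)  ## trace the posterior log-likelihoods
plot(hsam)              ## trace the individual parameters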

How to pipe DMC samples to ggdmc samplers

The diffusion decision model (DDM; Ratcliff & McKoon, 2008) is one of the most popular cognitive models for fitting RT data in cognitive psychology. Here we show two examples: one fitting the LBA model and the other fitting the DDM.

###################
##   LBA model   ##
###################
## DMC can be downloaded from osf.io/pbwx8.
setwd("~/Documents/DMCpaper/")
source ("dmc/dmc.R")
load_model ("LBA","lba_B.R")
setwd("~/Documents/ggdmc_paper/")
## load("data/dmc_pipe_LBA.rda")

model <- model.dmc(p.map = list( A = "1", B = "R", t0 = "1",
                                mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
          match.map = list(M = list(s1 = 1, s2 = 2)),
          factors = list(S = c("s1", "s2"), F = c("f1", "f2")),
          constants = c(sd_v.false = 1, st0 = 0),
          responses = c("r1", "r2"),
          type = "norm")
pop.mean <- c(A=.4, B.r1=.6, B.r2=.8, t0=.3, mean_v.f1.true=1.5, 
              mean_v.f2.true=1, mean_v.f1.false=0, mean_v.f2.false=0,
              sd_v.true = .25)
pop.scale <-c(A=.1, B.r1=.1, B.r2=.1, t0=.05, mean_v.f1.true=.2, 
              mean_v.f2.true=.2, mean_v.f1.false=.2, mean_v.f2.false=.2,
              sd_v.true = .1)
pop.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,1))
raw.data <- h.simulate.dmc(model, p.prior = pop.prior, n = 30, ns = 20)
data.model <- data.model.dmc(raw.data, model)

ps <- attr(raw.data, "parameters")

p.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = pop.scale*5,
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,NA))
mu.prior <- prior.p.dmc(
  dists = rep("tnorm",9),
  p1    = pop.mean,
  p2    = c(1,1,1,2,2,2,2,1,1),
  lower = c(0,0,0,NA,NA,NA,NA,0,.1),
  upper = c(NA,NA,NA,NA,NA,NA,NA,NA,NA))
sigma.prior <- prior.p.dmc(
  dists = rep("beta", 9),
  p1    = c(A=1, B.r1=1, B.r2=1, t0=1, mean_v.f1.true=1, mean_v.f2.true=1, 
            mean_v.f1.false=1, mean_v.f2.false=1, sd_v.true = 1),
  p2    = rep(1,9))
pp.prior <- list(mu.prior, sigma.prior)

hsamples <- h.samples.dmc(nmc = 50, p.prior, data.model, pp.prior = pp.prior,
  thin = 1)

## Piping 
hsam0 <- ggdmc::run(hsamples, pm = .05, report = 10)


## Turn off migration. Default pm = 0
hsam1 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam0, 
  pp.prior = pp.prior, thin = 1))
hsam2 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam1,
   pp.prior = pp.prior, thin = 1))
hsam3 <- ggdmc::run(h.samples.dmc(nmc = 50, p.prior, samples = hsam2,
  pp.prior = pp.prior, thin = 1))

## Check whether the MCMC chains converge
plot(hsam3, pll = TRUE)
plot(hsam3)


hest1 <- summary(hsam3, hyper = TRUE, recovery = TRUE, ps = pop.mean)
hest2 <- summary(hsam3, hyper = TRUE, recovery = TRUE, ps = pop.scale, type = 2)
est <- summary(hsam3)


###################
##   DDM         ##
###################
rm(list = ls())
setwd("~/Documents/DMCpaper")
source ("dmc/dmc.R")
load_model ("DDM", "ddm.R")
setwd("~/Documents/ggdmc_paper/")
## load("data/hierarchical/dmc_pipe_DDM.rda")
model <- model.dmc(
    p.map     = list(a = "1", v = "F", z = "1", d = "1", sz = "1", sv = "1",
                     t0 = "1", st0 = "1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2"), F = c("f1", "f2")),
  constants = c(st0 = 0, d = 0),
  responses = c("r1", "r2"),
  type      = "rd")
  
## Population distribution
pop.mean <- c(a=2,  v.f1=4, v.f2=3, z=0.5, sz=0.3, sv=1, t0=0.3)
pop.scale <-c(a=0.5,v.f1=.5,v.f2=.5,z=0.1, sz=0.1, sv=.3,t0=0.05)
pop.prior <- prior.p.dmc(
   dists = rep("tnorm", 7),
   p1    = pop.mean,
   p2    = pop.scale,
   lower = c(0,-5, -5, 0, 0, 0, 0),
   upper = c(5, 7,  7, 1, 2, 1, 1) )

raw.data   <- h.simulate.dmc(model, p.prior = pop.prior, n = 32, ns = 8)
data.model <- data.model.dmc(raw.data, model)
ps <- attr(raw.data, "parameters")

p.prior <- prior.p.dmc(
  dists = c("tnorm","tnorm","tnorm","tnorm","tnorm","tnorm","tnorm"),
  p1=pop.mean,
  p2=pop.scale*5,
  lower=c(0,-5, -5, 0, 0, 0, 0),
  upper=c(5, 7,  7, 1, 2, 1, 1)
)

mu.prior <- prior.p.dmc(
  dists = c("tnorm","tnorm","tnorm","tnorm","tnorm","tnorm","tnorm"),
  p1=pop.mean,
  p2=pop.scale*5,
  lower=c(0,-5, -5, 0, 0, 0, 0),
  upper=c(5, 7,  7, 1, 2, 1, 1)
)
sigma.prior <- prior.p.dmc(
  dists = rep("beta", length(p.prior)),
  p1=c(a=1, v.f1=1,v.f2 = 1, z=1, sz=1, sv=1, t0=1),
  p2=c(1,1,1,1,1,1,1),
  upper=c(2,2,2,2,2,2,2)
)

pp.prior <- list(mu.prior, sigma.prior)
  
## Sampling  
hsam0 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, data = data.model, thin = 2), pm = .1)

hsam1 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam0, thin = 32), pm = .1)

hsam2 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam1, thin = 128), pm = .3)

hsam3 <- ggdmc::run(h.samples.dmc(nmc = 512, p.prior = p.prior, 
      pp.prior = pp.prior, samples = hsam2, thin = 2))
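As with the LBA example above, inspect the chains and then summarise (a sketch; the same plot and summary methods apply):

plot(hsam3, pll = TRUE)
hest <- summary(hsam3, hyper = TRUE, recovery = TRUE, ps = pop.mean)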

How to fit a fixed-effects model with multiple participants

require(ggdmc)
## Model Setup----------
model <- BuildModel(p.map = list( A = "1", B = "R", t0 = "1",
   mean_v = c("F", "M"), sd_v = "M", st0 = "1"),
   match.map = list(M = list(s1 = 1, s2 = 2)),
   factors = list(S = c("s1", "s2"), F = c("f1", "f2")),
   constants = c(sd_v.false = 1, st0 = 0),
   responses = c("r1", "r2"),
   type = "norm")
   
npar <- length(GetPNames(model))

## Population distribution, rate effect on F
pop.mean <- c(A = .4, B.r1 = .6, B.r2 = .8, t0 = .3, mean_v.f1.true = 1.5,
  mean_v.f2.true = 1, mean_v.f1.false = 0, mean_v.f2.false = 0, sd_v.true = .25)
pop.scale <-c(A = .1, B.r1 = .1, B.r2 = .1, t0 = .05, mean_v.f1.true = .2,
  mean_v.f2.true = .2, mean_v.f1.false = .2, mean_v.f2.false = .2, sd_v.true = .1)
pop.prior <- BuildPrior(
  dists = rep("tnorm", npar),
  p1    = pop.mean,
  p2    = pop.scale,
  lower = c(0,   0,  0, NA, NA, NA, NA, 0, .1),
  upper = c(NA, NA, NA, NA, NA, NA, NA, NA, 1))

## Simulate some data
nsubject <- 8
ntrial <- 1e2
dat <- simulate(model, nsim = ntrial, nsub = nsubject, prior = pop.prior)
dmi <- BuildDMI(dat, model)
ps  <- attr(dat, "parameters")

# round( matrixStats::colMeans2(ps), 2)
# round( matrixStats::colSds(ps), 2)

## FIT FIXED-EFFECT MODEL----------
## Use all truncated normal priors for locations
p.prior <- BuildPrior(
  dists = rep("tnorm", npar),
  p1    = pop.mean,
  p2    = pop.scale * 5,
  lower = c(0,   0,  0, NA, NA, NA, NA,  0, .1),
  upper = c(NA, NA, NA, NA, NA, NA, NA, NA, NA))

thin <- 8
nchain <- npar * 3
migrationRate <- .2
sam <- run(StartManynewsamples(512, dmi, p.prior, thin, nchain),
  pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 16), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 64), pm = migrationRate, ncore = 8)
sam <- run(RestartManysamples(512, sam, thin = 16), pm = migrationRate, ncore = 8)

plot(sam[[6]])
plot(sam)  ## This will take a while, because it plots many participants

How to conduct automatic convergence checks

One challenge in Bayesian modeling is to make sure the posterior samples come from the target distributions. When using the DE-MCMC sampler to fit very complex models, some regions of the parameter space might be difficult to handle. Here we provide one method for sampling automatically, by repeatedly fitting a model until its posterior samples are proper.

First, we convert the first-stage samples (i.e., hsam0) to a generic object, hsam. Then we use R's repeat statement to iterate the model fits. Meanwhile, we use hgelman to check whether the PSRFs at both the hyper and the data level are smaller than 1.1, which would suggest the chains are well mixed.

thin <- 2
hsam <- hsam0
counter <- 1
repeat {
  hsam <- run(RestartHypersamples(5e2, hsam, thin = thin),
    pm = .3, hpm = .3)
  save(hsam, file = "data/tmp_posterior_samples.rda")
  rhats <- hgelman(hsam)
  counter <- counter + 1
  thin <- thin * 2
  if (all(rhats < 1.1) || counter > 1e2) break
}
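Once the loop exits, the posterior samples can be summarised as usual (a sketch, mirroring the recovery checks in the piping example above):

hest <- summary(hsam, hyper = TRUE, recovery = TRUE, ps = pop.mean)
est  <- summary(hsam)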


List of models currently hard-wired in ggdmc

  1. The LBA model, type = "norm",
  2. The DDM, type = "rd",
  3. The Piecewise LBA model 0; CPU-based PDA likelihoods; type = "plba0",
  4. The Piecewise LBA model 1; CPU-based PDA likelihoods; type = "plba1",
  5. The Piecewise LBA model 0; GPU-based PDA likelihoods; type = "plba0_gpu",
  6. The Piecewise LBA model 1; GPU-based PDA likelihoods; type = "plba1_gpu",
  7. The LBA model; GPU-based PDA likelihoods; type = "norm_pda_gpu",
  8. The correlated accumulator model; type = "cnorm".

For details regarding the PLBA types, please see Holmes, Trueblood, and Heathcote (2016).
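Each type string is passed to the type argument of BuildModel. For instance, a minimal DDM specification (a sketch condensed from the DDM example above, dropping the F factor):

model <- BuildModel(
  p.map     = list(a = "1", v = "1", z = "1", d = "1", sz = "1", sv = "1",
                   t0 = "1", st0 = "1"),
  match.map = list(M = list(s1 = "r1", s2 = "r2")),
  factors   = list(S = c("s1", "s2")),
  constants = c(st0 = 0, d = 0),
  responses = c("r1", "r2"),
  type      = "rd")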

Further information

Please see my tutorials site, Cognitive Model, for more details.

Prerequisites

  • R (>= 3.4.0)
  • Rcpp (>= 0.12.10), RcppArmadillo (>= 0.7.700.3.0), ggplot2 (>= 2.1.0), rtdists (>= 0.6-6), ggmcmc (>= 0.7.3), coda (>= 0.16-1), tmvtnorm, matrixStats, data.table
  • Windows users need Rtools (>= 3.3.0.1959)
  • Mac OS users need to make clang understand the OpenMP flag
  • Linux/Unix users may need to install the OpenMP library, if it has not been installed.
  • Armadillo requires a recent compiler; for the g++ family at least version 4.6.*

Installation

The CRAN method is currently unavailable; the latest version of the package is still undergoing the standard CRAN software checks.

~~From CRAN: install.packages("ggdmc")~~

From source:
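Download a source tarball and install it with the standard R command (a generic sketch; substitute the file name of the version you downloaded):

install.packages("ggdmc_0.2.5.2.tar.gz", repos = NULL, type = "source")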

From GitHub (you will need to install devtools first):

devtools::install_github("yxlin/ggdmc")

For Mac Users:

  1. You will need to install gfortran. As of 27 August 2018, please install gfortran 6.1, even if you are using macOS High Sierra version 10.13.4; gfortran 6.3 may not work.

  2. Then install clang4-r, a convenient tool created by James Balamuta. Once you install clang4-r, your clang will understand the OpenMP flag in ggdmc. You may use the other methods he suggests to make the Mac toolchain understand the OpenMP flag, but they are usually less straightforward.

Please visit his site for detailed explanations about the clang-OpenMP issue.

Citation

If you use this package, please cite the software, for example:

Lin, Y.-S., & Heathcote, A. Population-based Monte Carlo is as Effective as Hamiltonian Monte Carlo in Fitting High Dimension Cognitive Model. Manuscript in preparation. Retrieved from https://github.com/yxlin/ggdmc

Lin, Y.-S., Heathcote, A., & Holmes, W. R. (submitted). Parallel Probability Density Approximation. Behavior Research Methods.

Lin, Y.-S., Strickland, L., Reynold, A., & Heathcote, A. (manuscript in preparation). Tutorial on Bayesian cognitive modeling.

Contributors

The R documentation, tutorials, C++ code, R helper functions, and packaging were developed by Yi-Shin Lin. DMC was developed by Andrew Heathcote (Heathcote et al., 2018). Please report bugs to me.

License

GPL-2

Acknowledgments

  • An early version of density.cpp was based on Voss & Voss's (2012) density.c in fast-dm 30.2.
  • An early version of the truncated normal distributions was based on Jonathan Olmsted's RcppTN 0.1-8 (https://github.com/olmjo/RcppTN) and Christopher Jackson's R code in the msm package.
  • Armadillo is a C++ library for linear algebra (http://arma.sourceforge.net/), authored by Conrad Sanderson.
  • Thanks to Matthew Gretton for consultation about rtdists.
  • Thanks to Andrew Heathcote for lending a MacBook Air to facilitate testing ggdmc on macOS (High Sierra, version 10.13.4).

News

0.2.5.2 Yi-Shin Lin

  • Add random shuffling across all crossover operators
  • Allow the posdrift switch to be passed from BuildModel

0.2.3.6 Yi-Shin Lin

  • Remove OpenMP feature in PLBA model

0.2.3.6 Yi-Shin Lin

  • Rewrite the documentation for all functions
  • Change the function naming convention
  • Add the -lgomp flag
  • Use a simple S3 method to resolve the issue of the ".dmc" usage in DMC

0.1.9.7 Yi-Shin Lin

- Fix hierarchical Bayesian sampling for the LBA model and the DDM

0.1.5 Yi-Shin Lin

- Add LBA PDF and prior C++ functions
- Revert to the R crossover, migrate, run, samples, etc. functions to make it
highly compatible with DMC

0.1.3.9 Yi-Shin Lin

- Fix more 'overloading ambiguity' errors in tnorm_wrapper.cpp

0.1.3.8 Yi-Shin Lin

- Add init.c to register native routines.

0.1.3.7 Yi-Shin Lin

- Update to Rcpp 0.12.10 and RcppArmadillo >= 0.7.700.0.0 to resolve
  'Rcpp::Nullable' problem.

0.1.3.6 Yi-Shin Lin

- Correct 'overloading ambiguity' installation failures on Solaris
- Update the link in README.md to the canonical URL,
'https://CRAN.R-project.org/package=RcppArmadillo'

0.1.3.5 Yi-Shin Lin

- First release to the public

0.1.3.4 Yi-Shin Lin

- Improve ddmc.Rd, dmc.Rd, and dprior.Rd
- Add dmc.R to generate ggdmc-package.Rd
- Merge DMC's and ggdmc's plot.prior and rename drawPrior as plot.priors
- ggdmc now has its own classes: dmc, dmc.list and hyper
- ggdmc now uses S3 classes to dispatch the different summary and plot functions.
- Finally, the LBA model is now in ggdmc. likelihood.dmc and transform.dmc branch
out to, e.g., logLik.ddm and logLik.lba_B, because two new classes, ddm and
lba_B, are added.
- Add new modules ggdmc_class.R and ggdmc_eam.R. The former defines
DMC classes and the latter collects evidence-accumulation models.
- Add more help pages.
- Merge profile.dmc with stats' profile
- roxygen2 is now also in charge of NAMESPACE

0.1.3.0 Yi-Shin Lin

- plot.dmc adds a function to superimpose the prior distribution on top of the
posterior distribution (ggplot2).
- DMC needs to call a modified coda to use this function (base).
