Build Dirichlet Process Objects for Bayesian Modelling

Perform nonparametric Bayesian analysis using Dirichlet processes without the need to program the inference algorithms. Utilise the included pre-built models or specify custom models, and allow the 'dirichletprocess' package to handle the Markov chain Monte Carlo sampling. The Dirichlet process objects can act as building blocks for a variety of statistical models including, but not limited to: density estimation, clustering and prior distributions in hierarchical models. See Teh, Y. W. (2011) <https://www.stats.ox.ac.uk/~teh/research/npbayes/Teh2010a.pdf>, among many other sources.



The dirichletprocess package provides tools for you to build custom Dirichlet process mixture models. You can use the pre-built Normal/Weibull/Beta distributions or create your own following the instructions in the vignette. In as little as four lines of code you can be modelling your data nonparametrically.

Installation

You can install the stable release of dirichletprocess from CRAN:

install.packages("dirichletprocess")

You can also install the development build of dirichletprocess from GitHub (this requires the devtools package) with:

devtools::install_github("dm13450/dirichletprocess")

For a full guide to the package and its capabilities please consult the vignette:

browseVignettes(package = "dirichletprocess")

Examples

Density Estimation

Dirichlet processes can be used for nonparametric density estimation.
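
For context, the model behind this kind of density estimate is usually written as a Dirichlet process mixture. The formulation below is the standard textbook one, given here as background rather than quoted from the package documentation; the symbols k, \theta_i, G, \alpha and G_0 are notation assumed for this sketch.

```latex
\begin{align*}
y_i \mid \theta_i &\sim k(y_i \mid \theta_i), \\
\theta_i \mid G &\sim G, \\
G &\sim \mathrm{DP}(\alpha, G_0).
\end{align*}
```

Here k is the mixture kernel (a Gaussian in the example that follows), \alpha is the concentration parameter and G_0 is the base measure.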

faithfulTransformed <- faithful$waiting - mean(faithful$waiting)
faithfulTransformed <- faithfulTransformed/sd(faithful$waiting)
dp <- DirichletProcessGaussian(faithfulTransformed)
dp <- Fit(dp, 100, progressBar = FALSE)
plot(dp)

# Cluster weights and means; the cluster standard deviations are stored in dp$clusterParameters[[2]]
data.frame(Weight=dp$weights, Mean=c(dp$clusterParameters[[1]]))
#>        Weight       Mean
#> 1 0.371323529 -1.1756510
#> 2 0.625000000  0.6597522
#> 3 0.003676471  0.1061095

Clustering

Dirichlet processes can also be used to cluster data based on their common distribution parameters.

faithfulTrans <- as.matrix(apply(faithful, 2, function(x) (x-mean(x))/sd(x)))
dpCluster <- DirichletProcessMvnormal(faithfulTrans)
dpCluster <- Fit(dpCluster, 1000, progressBar = FALSE)

To plot the results, we take the cluster labels contained in the dpCluster object and assign each cluster a colour.
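
The plotting step described above could be sketched as follows. This is a minimal illustration rather than the package's own plotting code: it assumes the fitted object exposes the labels as dpCluster$clusterLabels and uses ggplot2 (one of the package's imports) for the scatter plot. The names faithfulPlot and clusterPlot are introduced here for illustration only.

```r
library(dirichletprocess)
library(ggplot2)

# Refit the clustering model from above so this block runs on its own
faithfulTrans <- as.matrix(apply(faithful, 2, function(x) (x - mean(x)) / sd(x)))
dpCluster <- DirichletProcessMvnormal(faithfulTrans)
dpCluster <- Fit(dpCluster, 1000, progressBar = FALSE)

# Attach each observation's cluster label to the original (untransformed) data
faithfulPlot <- data.frame(faithful, Cluster = as.factor(dpCluster$clusterLabels))

# Colour each point by its assigned cluster
clusterPlot <- ggplot(faithfulPlot, aes(x = eruptions, y = waiting, colour = Cluster)) +
  geom_point()
print(clusterPlot)
```

Because the labels come from a single MCMC state, the number of clusters and their membership can differ between runs.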

For more detailed explanations and examples see the vignette.

News

dirichletprocess 0.2.1

  • Added AppVeyor, Travis-CI and codecov.io badges.
  • Added penalised log-likelihood step for posterior cluster parameter inference.
  • Added exponential mixture model DirichletProcessExponential.
  • Updated plot. Multivariate Gaussian models can now be plotted.
  • Various bug fixes.
  • Updated description.

dirichletprocess 0.2.0

  • First public release.
  • Added a NEWS.md file to track changes to the package.

0.2.2 by Dean Markwick, 2 months ago


https://github.com/dm13450/dirichletprocess


Report a bug at https://github.com/dm13450/dirichletprocess/issues


Browse source code at https://github.com/cran/dirichletprocess


Authors: Gordon J. Ross [aut] , Dean Markwick [aut, cre] , Kees Mulder [ctb]


GPL-3 license


Imports gtools, ggplot2, mvtnorm

Suggests testthat, knitr, rmarkdown, tidyr, dplyr

