Clustering via parsimonious Gaussian Mixtures of Experts using the MoEClust models introduced by Murphy and Murphy (2020)
Fits MoEClust models introduced by Murphy and Murphy (2017) <arXiv:1711.05632>, i.e. fits finite Gaussian mixture of experts models with gating and/or expert network covariates supplied via formula interfaces using a range of parsimonious covariance parameterisations via the EM/CEM algorithm. Also visualises Gaussian mixture of experts models with parsimonious covariance structures using generalised pairs plots.
The most important function in the MoEClust package is MoE_clust, for fitting the model via EM/CEM with gating and/or expert network covariates, supplied via formula interfaces. Other functions also exist, e.g. MoE_control, MoE_crit, MoE_dens, MoE_estep, and aitken, which are all used within MoE_clust but are nonetheless made available for standalone use. MoE_compare is provided for conducting model selection between different results from MoE_clust using different covariate combinations and/or initialisation strategies, etc.
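As a rough illustration of this workflow, the sketch below fits a few models to the ais data shipped with the package and compares them. The object names (hema, m1 to m4, comp, best) are hypothetical, and the choice of columns 3 to 7 as the response variables is an assumption about the layout of ais; treat it as a starting point rather than a recommended analysis.

```r
library(MoEClust)

# Assumption: columns 3 to 7 of 'ais' hold the continuous response variables
data(ais)
hema <- ais[, 3:7]

# Standard finite Gaussian mixture, no covariates
m1 <- MoE_clust(hema, G = 2:3)

# Covariate in the gating network only
m2 <- MoE_clust(hema, G = 2:3, gating = ~ BMI, network.data = ais)

# Covariate in the expert network only
m3 <- MoE_clust(hema, G = 2:3, expert = ~ sex, network.data = ais)

# Covariates in both networks
m4 <- MoE_clust(hema, G = 2:3, gating = ~ BMI, expert = ~ sex,
                network.data = ais)

# Model selection across the four sets of results
comp <- MoE_compare(m1, m2, m3, m4)
print(comp)

# The single best model according to the comparison
best <- comp$optimal
```

Settings such as the fitting algorithm or initialisation strategy would be passed through MoE_control; see help(MoE_control) for the arguments available in the installed version.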
A dedicated plotting function exists for visualising the results using generalised pairs plots, for examining the gating network, and/or log-likelihood, and/or clustering uncertainties, and/or graphing model selection criteria values. The generalised pairs plots (MoE_gpairs) visualise all pairwise relationships between clustered response variables and associated continuous, categorical, and/or ordinal covariates in the gating and/or expert networks, coloured according to the MAP classification, and also give the marginal distributions of each variable (incl. the covariates) along the diagonal.
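A minimal sketch of the plotting functions named in this document, called directly on a fitted model, might look as follows. The model specification and the object name mod are hypothetical, and each function is called with its default arguments only; the available options may differ between package versions.

```r
library(MoEClust)

# Assumption: columns 3 to 7 of 'ais' hold the continuous response variables
data(ais)
mod <- MoE_clust(ais[, 3:7], G = 2:3, gating = ~ BMI, expert = ~ sex,
                 network.data = ais)

# Generalised pairs plot of responses and covariates, coloured by MAP classification
MoE_gpairs(mod)

# Mixing proportions from the gating network
MoE_plotGate(mod)

# Model selection criterion values across the fitted models
MoE_plotCrit(mod)

# Log-likelihood trace for the optimal model
MoE_plotLogLik(mod)

# Clustering uncertainties
MoE_Uncertainty(mod)
```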
An as.Mclust method is provided to coerce the output of class "MoEClust" from MoE_clust to the "Mclust" class, to facilitate use of plotting and other functions for the "Mclust" class within the mclust package. As per mclust, MoEClust also facilitates modelling with an additional noise component (with or without the mixing proportion for the noise component depending on covariates). Finally, a predict method is provided for predicting the fitted response and probability of cluster membership (and by extension the MAP classification) for new data, in the form of new covariates and new response data, or new covariates only.
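The sketch below illustrates both methods under some assumptions: the model is fitted to the ais data with five randomly chosen rows held back, the object names (ind, train, mod, mc, pred) are hypothetical, and the held-back rows are passed as a data frame containing both the covariates and the responses.

```r
library(MoEClust)

# Hold back five rows of 'ais' for prediction (arbitrary seed, for reproducibility)
data(ais)
set.seed(1)
ind   <- sample(seq_len(nrow(ais)), 5)
train <- ais[-ind, ]

# Assumption: columns 3 to 7 hold the continuous response variables
mod <- MoE_clust(train[, 3:7], G = 2, gating = ~ BMI, expert = ~ sex,
                 network.data = train)

# Coerce to the "Mclust" class for use with functions from the mclust package
mc <- as.Mclust(mod)
class(mc)

# Predict for the held-back rows using their covariates and responses
pred <- predict(mod, newdata = ais[ind, ])
pred$classification  # MAP cluster labels
pred$z               # cluster membership probabilities
```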
The package also contains two data sets: ais and CO2data.
You can install the latest stable official release of the MoEClust package from CRAN:
install.packages("MoEClust")
or the development version from GitHub:
# If required install devtools:
# install.packages('devtools')
devtools::install_github('Keefe-Murphy/MoEClust')
In either case, you can then explore the package with:
library(MoEClust)
help(MoE_clust) # Help on the main modelling function
For a more thorough intro, the vignette document is available as follows:
vignette("MoEClust", package="MoEClust")
However, if the package is installed from GitHub, the vignette is not built automatically. To build it as part of the GitHub installation, use:
devtools::install_github('Keefe-Murphy/MoEClust', build_vignettes = TRUE)
Alternatively, the vignette is available on the package's CRAN page.
K. Murphy and T. B. Murphy (2017). Parsimonious Model-Based Clustering with Covariates. To appear. <arXiv:1711.05632>
MoE_control arg. algo allows model fitting using the "EM" or "CEM" algorithm:
MoE_cstep added.
algo option "cemEM" allows running EM starting from convergence of CEM.
Added LOGLIK to MoE_clust output, giving maximal log-likelihood values for all fitted models.
DF/ITERS, etc., with associated printing/plotting functions.
MoE_compare, summary.MoEClust, and MoE_plotCrit accordingly.
MoE_control arg. nstarts allows for multiple random starts when init.z="random".
MoE_control arg. tau0 provides another means of initialising the noise component.
When clustMD is invoked for initialisation, models are now run more quickly in parallel.
gating and expert formulas without intercept terms (drop_constants also edited).
MoE_plotGate now allows a user-specified x-axis against which mixing proportions are plotted.
predict.MoEClust function added: predicts cluster membership probability,
noise.gate option) accounted for.
MoE_Uncertainty added (callable within plot.MoEClust):
response.type="density" to MoE_gpairs now works properly for models with
Added clustMD package to Suggests:. New MoE_control argument exp.init$clustMD
isTRUE(exp.init$joint) & clustMD is loaded (defaults to FALSE, works with noise).
drop.break arg. to MoE_control for further control over the extra initialisation
MoE_dens for the EEE & VVV models by using already available Cholesky factors.
MoE_control arguments:
km.args specifies kstarts & kiters when init.z="kmeans".
init.z="hc" & noise into hc.args & noise.args.
hc.args now also passed to call to mclust when init.z="mclust".
init.crit ("bic"/"icl") controls selection of optimal mclust/clustMD
init.z="mclust" or isTRUE(exp.init$clustMD));
init.z="mclust".
ITERS replaces iters as the matrix of the number of EM iterations in MoE_clust output:
iters now gives this number for the optimal model.
ITERS now behaves like BIC/ICL etc. in inheriting the "MoECriterion" class.
iters now filters down to summary.MoEClust and the associated printing function.
ITERS now filters down to MoE_compare and the associated printing function.
response.type="uncertainty"
MoE_gpairs to better conform to mclust: previously no transparency.
subset arg. to MoE_gpairs now allows data.ind=0 or cov.ind=0, allowing plotting of
MoE_gpairs plots.
sigs arg. to MoE_dens and MoE_estep must now be a variance object, as per variance
MoE_clust & mclust output, the number of clusters G,
d & modelName is inferred from this object: the arg. modelName was removed.
MoE_clust no longer returns an error if init.z="mclust" when no gating/expert network
init.z="hc" is used to better reproduce mclust output.
resid.data now returned by MoE_clust as a list, to better conform to MoE_dens.
Renamed MoE_aitken & MoE_qclass to aitken & quant_clust, respectively.
data w/ missing values now dropped for gating/expert covariates too (MoE_clust).
linf within aitken & the associated stopping criterion.
linf estimate now returned for optimal model when stopping="aitken" & G > 1.
resid & residuals args. to as.Mclust & MoE_gpairs.
MoE_plotCrit, MoE_plotGate & MoE_plotLogLik now invisibly return relevant quantities.
G=0 models when noise.init is not supplied.
drop_levels to handle alphanumeric variable names and ordinal variables.
MoE_compare when a mix of models with and without a noise component are supplied.
MoE_compare when optimal model has to be re-fit due to mismatched criterion.
MoE_Uncertainty plots.
print.MoECompare now has a digits arg. to control rounding of printed output.
MoE_clust & MoE_compare.
drop_constants.
Replaced is.list(x) with inherits(x, "list") for stricter checking.
MoE_clust.
Added mclust::clustCombi/clustCombiOptim examples to as.Mclust documentation.
Added MoE_news for accessing this NEWS file.
G is at either end of the range considered.
cat/message/warning calls for printing clarity.
usage sections of multi-argument functions.
MoEClust-package help file (formerly just MoEClust).
MoE_control gains the noise.gate argument (defaults to TRUE): when FALSE,
x$parameters$mean is now reported as the posterior mean of the fitted values when
MoE_gpairs plots when there are expert covariates.
expert_covar used to account for variability in the means, in the presence
MoE_control gains the hcUse argument (defaults to "VARS" as per old mclust versions).
MoE_mahala gains the squared argument + speedup/matrix-inversion improvements.
matrixStats (on which MoEClust already depended).
MoE_gpairs argument addEllipses gains the option "both".
equalPro=TRUE in the presence of a noise component when there are
MoE_gpairs argument scatter.type gains the options lm2 & ci2 for further control
lm & ci type plots were being
MoE_mahala and in expert network estimation with a noise component.
G=0 models w/ noise component only can now be fitted without having to supply noise.init.
MoE_compare now correctly prints noise information for sub-optimal models.
stopping="relative": now conforms to mclust.
Added check.margin=FALSE to calls to sweep().
Added call.=FALSE to all stop() messages.
grid library.