Dependent Mixture Models - Hidden Markov Models of GLMs and Other Distributions in S4

Fits latent (hidden) Markov models on mixed categorical and continuous (time series) data, otherwise known as dependent mixture models, see Visser & Speekenbrink (2010, ).


Changes in depmixS4 version 1.3-5

o GLM responses now allow missing values in covariates, as long as the response variable (dependent variable) is also missing at those positions

o If the likelihood decreases on an interation in the EM algorithm, the fitted model is returned with a warning, rather than producing an error (requested by Arthur Bossavy).

o Removed the meta-data from the depmixS4-package help page as that tends to get out of sync with new releases (and adds little information).

o Added documentation for makeMix (following a request by Ivan Jimenez; the function has not been tested extensively).

Changes in depmixS4 version 1.3-4

o Fixed a bug in which produced an error for a poisson response when there were missing values.

o Added an example on ?fit to get posterior state sequences for new or 'out-of-sample' data.

Changes in depmixS4 version 1.3-3

o The underlying code for the forwardbackward function was changed from extern 'C' in a c++ environment to plain C (the c++ definitions were there for historical reasons but no longer necessary).

o Added support for fitting linear inequality constraints using solnp optimization (thanks to Romann Weber for noticing this support was missing).

o Added factor level names to summary of multinomial() models to ease interpretation.

o Fixed bug in EM with "hard" classification for mixture models (did not work with random starting values).

o In EM with "hard" classification, when a component/state is empty (has no assignments), previous parameter values are now retained.

o Improved error message when optimization method is wrongly specified in fit function.

o Fixed a bug in the summary method for fitted models; in some cases parameters of the prior model were printed in the transition matrix; thanks to Verena Schmittmann for catching this bug (which was introduced in 1.3-0 along with compact argument in summary's of fitted models).

Changes in depmixS4 version 1.3-2

o Removed partial matching argument in call to check.attributes for compatibility with R release 3.1.0.

Changes in depmixS4 version 1.3-1

Major changes

o Fixed a bug in the fit function of (dep-)mix objects; upper and lower bounds of general linear contraints were not passed to the optimization routines when any of the bounds was zero; in that case, all bounds were set to zero. Thanks to Vincent Miele for bringing this to my attention.

Minor changes

o Added a reference to Giudici et al (2000) in the vignette in the section on likelihood ratio testing which fits better with the procedure that is implemented. Thanks to Jan Bulla for probing us about the vignette text there.

o Added a conditional to set the log likelihood of a model to a value just above the starting value (i.e. the values before iterations start) when the logl functions returns Infinite (which may sometimes occur when parameters are set outside their boundaries).

Changes in depmixS4 version 1.3-0

Major changes

o The EM algorithm has gained an extra argument 'classification', passed from the 'fit' function using argument 'emcontrol', to allow a choice between maximising the regular (classification="soft", the default) or classification (classification="hard") likelihood. WARNING: using the classification likelihood combined with random starting values may easily lead to unstable results; use with caution.

o Parameters are now given proper names, following the glm() scheme (e.g. '(Intercept)', 'x1', et cetera); with this, the show and summary methods have changed considerably and now produce more compact and more readable output. The summary method (for both fitted and unfitted models) now has an argument 'compact' (TRUE by default) that controls the presentation of the repsonse model parameters. The prior and transition models are now presented more compactly when there are no covariates.

o The fit function has gained two arguments: solnpcntrl and donlpcntrl which can be used to fine tune optimization using packages Rsolnp and Rdonlp2; the latter is not on CRAN, and the version that depmixS4 is compatible with is from r-forge (note the licence from the package).

o The EM algorithm is now more memory efficient and hence faster, most notably for large models and/or data; thanks to Robert McGehee for generously providing these code enhancements.

Minor changes

o Binomial models now treat factors in the same way as glm(); that is the first level of a factor is treated as a failure, and the remaining levels as successes.

o Un-fitted models now also have a summary method, which is identical to the show method for these models.

o Added function getmodel(object) to select submodels of a full mix or depmix model, eg to use in deriving predictions, getting at parameter values, et cetera.

o Small parameter values in multinomial response models and the initial probabilities and transition models (with identity link) are now set to zero to speed-up convergence. Current threshold is 1e-6.

Changes in depmixS4 version 1.2-2

o Changed class assignment of depmix.fitted object using as().

o Fixed a bug in the fit method of depmix models: linear inequality constraints were not passed on to rsolnp (thanks to Peiming Wang for bringing this to my attention).

Changes in depmixS4 version 1.2-1

o Fixed a bug in handling of missing values for mix models

Changes in depmixS4 version 1.2-0

Major changes

o Missing values for responses are now allowed. Note that missing values in covariates will cause errors.

Changes in depmixS4 version 1.1-0

Major changes

o The main loop computing the forward and backward variables for use in the EM algorithm is now implemented in C. Depending on model specifics this results in a 2-4 fold speed increase when fitting models.

o The Changes for each release (in the NEWS file) is now split into two sections: Major and Minor changes.

o Added several examples on the ?responses page (Poisson change point model, similar model with a single multinomial response) and the ?depmix page (model for S&P 500 returns; thanks to Chen Haibo for sending this).

Minor changes

o Corrected a typo in the vignette in Equation 1; the first occurrence of S read S_t instead of S_1 (thanks to Peng Yu for reporting this).

o Added a sensible error message when the data contains missings (depmixS4 can not handle missing data yet).

o Fixed a bug in the relative stopping criterion for EM (which resulted in immediate indication of convergence for positive log likelihoods; thanks to Chen Haibo for sending the S&P 500 example which brought out this problem).

o Function forwardbackward now has a useC argument to determine whether C code is used, the default, or not (the R code is mostly left in place for easy debugging).

o Added a fix for models without covariates/intercepts. In responseGLM and responseMVN the function setpars now exits when length(value) == 0. In setpars.depmix, a check is added whether npar > 0.

Changes in depmixS4 version 1.0-4

o Added examples of the use of ntimes argument on ?depmix and ?fit help pages using the ?speed data (which now has the full reference to the accompanying publication).

o Using nobs generic from stats/stats4 rather than defining them anew (which gave clashes with other packages that did the same).

o Fixed a bug in simulation of gaussian response model, which was returning NaNs due to an error in assignment of the sd parameter (introduced in version x). Thanks to Jeffrey Arnold for reporting this (bug #1365).

Changes in depmixS4 version 1.0-3

o Using AIC/BIC/logLik generics from stats/stats4 rather than defining them anew (which gave clashes with other packages that did the same).

Changes in depmixS4 version 1.0-2

o fixed a bug in simulation of binomial response model data (the response consists of the number of successes, and the number of failures; in simulation, the number of failures was an exact copy of the number of successes).

o added a meaningful error message in the EM algorithm for lca/mixture models in case the initial log likelihood is NA (thanks to Matthias Ihrke for pointing this out).

Changes in depmixS4 version 1.0-1

o minor changes in documentation to conform to R 2.12.0 standards.

o fixed a bug concerning random start values (the argument to specify this was not passed to the EM algorithm and hence was completely ineffective ...).

o changed the emcontrol argument to the fit function; it now calls a function em.control which returns the list of control parameters, which now also includes maxit, the max number of iterations of the EM algorithm. This makes future additions to EM control parameters easier.

o random parameter initialization is now the default when using EM to fit models.

o fixed a bug in multinomial models with n>1; the parameters are now normalized such that they sum to unity (this bug was introduced in version 0.9-0 in multinomial models with identity link).

o added an error message for multinomial response models with n>1 and link='mlogit' as this case is not handled; n>1 multinomial can use the 'identity' link function.

Changes in depmixS4 version 1.0-0

o added a vignette to the package and upped the version number 1.0-0 to celebrate publication in the Journal of Statistical Software.

Changes in depmixS4 version 0.9-1

o fixed a bug in setting the lower and upper bounds for GLMresponse models (the number of bounds was wrong for models with covariates/ predictors; these bounds are only used in constrained optimization in which case they produced an error immediately; in EM optimization these bounds are not used).

Changes in depmixS4 version 0.9-0

o added optimization using Rsolnp, which can be invoked by using method="rsolnp" in calling fit on (dep-)mix objects. Note that this is meant for fitting models with additional constraints. Method="rsolnp" is now the default when fitting constrained models, method="donlp" is still supported.

o added documentation for control arguments that can be passed to em algorithm, particularly for controlling the tolerance in optimization.

o added multinomial models with identity link for transition and prior probabilities. These are now the default when no covariates are present.

o added bounds and constraints for multinomial identity models such that these constraints are satisfied when fitting models with method="rsolnp" or "donlp". Also, variance and sd parameters in gaussian and multivariate normal models are given bounds to prevent warnings and errors in optimization of such models using rsolnp or donlp.

o added option to generate starting values as part of the EM algorithm.

o fixed a bug in multinomial response models with n>1; the response for these models can be specified as a k-column matrix with the number of observed responses for each category; the loglikelihood for these models in which there was more than 1 observation per row was incorrect; note that such models may lead to some numerical instabilities when n is large.

Changes in depmixS4 version 0.3-0

o added multinomial response function with identity link (no covariates allowed in such a model); useful when (many) boundary values occur; currently no constraints are used for such models, and hence only EM can be used for optimization, or alternatively, if and when Rdonlp2 is used, sum constraints need to be added when fitting the model. See ?GLMresponse for details.

o added an example of how to specify a model with multivariate normal responses (and fixed a bug in MVNresponse that prevented such models from being specified in the first place). See ?makeDepmix for an example.

Changes in depmixS4 version 0.2-2

o fixed a warning produced when specifying conrows.upper and .lower in the fit function

o added error message in case the initial log likelihood is infeasible

o fixed a bug in the fit function for multinomial response models with covariates (thanks to Gilles Dutilh for spotting this)

Changes in depmixS4 version 0.2-1

o fixed a bug in the Viterbi algorithm used to compute posterior states (this bug was introduced in version 0.2-0)

o restructured test files somewhat

o fixed a bug in the use of the conrows argument in the fit function (a missing drop=FALSE statement)

o updated help files for mix classes

o fixed a bug in setting the starting values of regression coefficients in prior and transInit models with covariates (thanks to Verena Schmittmann for reporting this)

o added newx argument to predict function of transInit objects, to be used for predicting probabilities depending on covariates (useful in eg plotting transition probabilities as function of a covariate)

o added example of the use of conrows argument in fitting functions and other minor updates in documentation

Changes in depmixS4 version 0.2-0

o restructured R and Rd (help) files; added depmixS4 help with a short overview of the package and links to appropriate help files

o added function 'simulate' to generate new data from a (fitted) model

o added function 'forwardbackward' to access the forward and backward variables as well as the smoothed transition and state variables

o added new glm distributions: gamma, poisson

o added multivariate normal distribution

o freepars now works correctly on both depmix and depmix.fitted objects

o added function 'nlin' to compute the number of linear constraints in a fitted model object

o added mix class for mixture and latent class models; the depmix class extends this mix class and adds a transition model to it

o added help file for makeDepmix to provide full control in specifying models

o minor changes to make depmixS4 compatible with R 2.7.1

Changes in depmixS4 version 0.1-1

o adjusted for R 2.7.0

First version released on CRAN: 0.1-0

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.


1.5-0 by Ingmar Visser, 8 months ago

Browse source code at

Authors: Ingmar Visser <[email protected]> , Maarten Speekenbrink <[email protected]>

Documentation:   PDF Manual  

Task views: Cluster Analysis & Finite Mixture Models, Time Series Analysis

GPL (>= 2) license

Imports stats, stats4, methods

Depends on nnet, MASS, Rsolnp, nlme

Suggests gamlss, gamlss.dist, Rdonlp2

Depended on by hmmr.

Suggested by ldhmm, plotHMM, segclust2d.

See at CRAN