Variable Selection for Model-Based Clustering of Mixed-Type Data Set with Missing Values

Full model selection (detection of the relevant features and estimation of the number of clusters) for model-based clustering (see reference here). The data to analyze can be continuous, categorical, integer or mixed. Missing values are allowed and require no pre-processing. A Shiny application makes the results easy to interpret.
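
As a rough illustration of the workflow described above, the following sketch clusters a small simulated mixed-type data set with missing values. The data, the column names and the chosen settings (`gvals = 1:3`, BIC as the selection criterion) are assumptions made for this example only; the reference manual remains the authority on arguments and defaults.

```r
# Minimal sketch on simulated data; column names and settings are illustrative.
library(VarSelLCM)

set.seed(42)
n <- 200
z <- rep(1:2, each = n / 2)                    # hypothetical true groups

x <- data.frame(
  cont  = rnorm(n, mean = c(0, 4)[z]),         # continuous, informative
  count = rpois(n, lambda = c(2, 10)[z]),      # integer, informative
  categ = factor(ifelse(runif(n) < c(0.2, 0.8)[z], "a", "b")),  # categorical
  noise = rnorm(n)                             # continuous, uninformative
)
x$cont[sample(n, 10)] <- NA                    # missing values, left as they are

# Variable selection and choice of the number of clusters among 1, 2 and 3
res <- VarSelCluster(x, gvals = 1:3, vbleSelec = TRUE, crit.varsel = "BIC")

summary(res)           # selected model and relevant variables
part <- fitted(res)    # estimated partition, one label per row of x
table(part, z)         # compare with the simulated groups
BIC(res)               # criterion value of the selected model

# Interactive exploration of the results (opens a Shiny application):
# VarSelShiny(res)
```

The simulated `noise` column is there only to give the variable selection something to discard; whether it is actually discarded depends on the fitted model.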


Version 2.1.2 (2018-08)
  o Correction of English typos.

Version 2.1.2 (2018-03)
  o The plot function has default arguments.
  o Extraction functions predict(), coef(), fitted(), AIC(), BIC() and MICL() are added.
  o print() and summary() methods have been modified.

Version 2.1.1 (2018-03)
  o ICL can be used if there is no variable selection.
  o Input data can be stored in a matrix (a Gaussian mixture is then considered).
  o Bug fixed: AIC is now computed if only one component is considered.
  o Reference to Journal of Classification for Marbac, M. and Patin, E. and Sedki, M. (2018).

Version 2.1 (2018-03)
  o A Shiny interface is available to analyze the clustering results (use function VarSelShiny on the clustering output).
  o Modification of the class returned by the clustering function (results obtained with an old version of VarSelLCM need to be converted with the function VarSelConvert).
  o Modification of the plot functions (ggplot2).
  o An imputation function permits the imputation of missing values based on the model parameters (use function VarSelImputation).

Version 2.0 (2017-10)
  o Mixed data are allowed (continuous, integer and categorical).
  o Missing values can be managed by assuming that values are missing completely at random.
  o Model selection can be done with BIC (optimized by a specific EM algorithm).



2.1.3 by Mohammed Sedki, 5 months ago

Authors: Matthieu Marbac and Mohammed Sedki

Documentation: PDF Manual

Task views: Cluster Analysis & Finite Mixture Models, Missing Data

GPL (>= 2) license

Imports methods, Rcpp, parallel, mgcv, ggplot2, shiny

Suggests knitr, rmarkdown, dplyr, htmltools, scales, plyr

Linking to Rcpp, RcppArmadillo

Imported by ClusVis.
