Implements methods for clustering mixed-type data,
specifically combinations of continuous and nominal data. Special attention
is paid to the often-overlooked problem of equitably balancing the
contribution of the continuous and categorical variables. This package
implements KAMILA clustering, a novel method for clustering
mixed-type data in the spirit of k-means clustering. It does not require
dummy coding of variables, and is efficient enough to scale to rather large
data sets. Also implemented is Modha-Spangler clustering, which uses a
brute-force strategy to maximize the cluster separation simultaneously in the
continuous and categorical variables. For more information, see Foss, Markatou,
Ray, & Heching (2016).
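
As a rough illustration of the workflow described above, the following sketch clusters a small simulated data set with `kamila()`, passing the nominal variable as a factor rather than as dummy-coded columns. The simulated data, the variable names (`trueId`, `conDf`, `catDf`, `probs`), and the choices of `numClust = 2` and `numInit = 10` are assumptions made for this example only; consult the package help pages for the authoritative argument list.

```r
library(kamila)  # assumes the package is already installed

set.seed(1)
n <- 100
trueId <- rep(1:2, each = n)  # hypothetical ground-truth cluster labels

# Two continuous variables whose means depend on the cluster, standardized.
conDf <- data.frame(scale(cbind(
  x1 = rnorm(2 * n, mean = 2 * trueId),
  x2 = rnorm(2 * n, mean = -2 * trueId)
)))

# One nominal variable with cluster-dependent level probabilities,
# stored as a factor (no dummy coding).
probs <- rbind(c(0.7, 0.2, 0.1), c(0.1, 0.2, 0.7))
catDf <- data.frame(
  c1 = factor(sapply(trueId, function(k) {
    sample(c("a", "b", "c"), size = 1, prob = probs[k, ])
  }))
)

# Continuous variables and factors are supplied as separate data frames.
kamRes <- kamila(conVar = conDf, catFactor = catDf,
                 numClust = 2, numInit = 10)

# Cross-tabulate estimated memberships against the simulated labels.
table(kamRes$finalMemb, trueId)
```

Cluster labels are arbitrary, so a good result may appear on either diagonal of the table.
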
R package for clustering mixed data. For more information, install the package and run

    library(kamila)
    ?`kamila-package`

from the R console. For an in-depth discussion of the challenges involved in clustering mixed-type data, please see our papers: