Genetic Approach to Maximize Clustering Criterion

An evolutionary approach to hard partitional clustering. The algorithm uses genetic operators guided by information about the quality of individual partitions. It searches for the configuration of barycenters (centroids), encoded as real values, that maximizes or minimizes one of the supported clustering validation criteria: Silhouette, Dunn Index, C-Index or Calinski-Harabasz Index. Like many other clustering algorithms, 'gama' asks for k, the number of partitions, fixed a priori. If the user does not know the best value for k, the algorithm estimates it using one of two user-specified options: minimum or broad. The first uses an approximation of the second derivative of a set of points to automatically detect the point of maximum curvature (the 'elbow') in the within-cluster sum of squares error (WCSSE) graph. The second estimates the best k by majority voting among 24 indices. A major advantage of 'gama' is that it introduces a bias toward partitions that satisfy a particular criterion. References: Scrucca, L. (2013); Charrad, M. et al. (2014); Tsagris, M. & Papadakis, M. (2018); Kaufman, L. & Rousseeuw, P. (1990, ISBN:0-471-73578-7).
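To make the idea of scoring a real-valued centroid encoding concrete, here is a minimal sketch of one possible fitness function based on the average silhouette width. It is an illustration under stated assumptions, not the package's actual implementation: the helper name `silhouette_fitness`, the row-wise chromosome layout, and the `-1` score for degenerate partitions are all hypothetical choices. It relies only on base R and `silhouette()` from the 'cluster' package.

```r
library(cluster)  # provides silhouette()

# Hypothetical helper (not part of gama's API): score one candidate solution,
# i.e. k centroids flattened into a real-valued vector, by the average
# silhouette width of the partition those centroids induce.
silhouette_fitness <- function(chromosome, data, k) {
  data <- as.matrix(data)
  # Decode the chromosome: one row per centroid, one column per feature.
  centroids <- matrix(chromosome, nrow = k, ncol = ncol(data), byrow = TRUE)

  # Squared Euclidean distance from every point to every centroid (n x k).
  d2 <- sapply(seq_len(k), function(j) {
    rowSums(sweep(data, 2, centroids[j, ])^2)
  })
  labels <- max.col(-d2, ties.method = "first")  # nearest centroid per point

  # The silhouette is undefined with fewer than two populated clusters;
  # returning the worst possible score is an assumption of this sketch.
  if (length(unique(labels)) < 2) return(-1)

  # Relabel to consecutive integers so empty clusters are skipped.
  labels <- as.integer(factor(labels))
  sil <- silhouette(labels, dist(data))
  mean(sil[, "sil_width"])
}

# Toy data: two tight, well-separated groups of four points each.
toy <- rbind(cbind(c(0, 0.2, 0, 0.2), c(0, 0, 0.2, 0.2)),
             cbind(c(5, 5.2, 5, 5.2), c(5, 5, 5.2, 5.2)))

good <- c(0.1, 0.1, 5.1, 5.1)  # one centroid per group
bad  <- c(0.0, 0.0, 0.2, 0.2)  # both centroids inside the first group

fit_good <- silhouette_fitness(good, toy, k = 2)
fit_bad  <- silhouette_fitness(bad,  toy, k = 2)
```

A genetic algorithm maximizing such a function would favor the `good` chromosome over the `bad` one, which is the sense in which the search is biased toward the chosen criterion.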


We present an R package to perform hard partitional clustering guided by a user-specified cluster validation criterion. The algorithm obtains high cluster validation indices when applied to datasets that contain superellipsoid clusters. It is capable of estimating the number of partitions for a given dataset either by automatically inferring the elbow in the WCSSE graph or by a broad search across 24 cluster validation criteria. The package ships with six built-in datasets for experimentation. Two of them are in-house datasets collected from real executions of distributed machine learning algorithms on Spark clusters; the others are well-known datasets used to benchmark clustering algorithms.
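The elbow inference mentioned above can be sketched with a discrete second derivative of the WCSSE curve. This is only an illustration in base R, not the package's own code: the helper name `elbow_k`, the use of `kmeans()` to produce the curve, and the default values for `k_max` and `nstart` are assumptions made for this example.

```r
# Hypothetical helper (not part of gama's API): pick the k at which the
# within-cluster sum of squares curve bends most sharply.
elbow_k <- function(data, k_max = 10, nstart = 25) {
  stopifnot(k_max >= 3)
  data <- as.matrix(data)
  ks <- seq_len(k_max)

  # Total within-cluster sum of squares for each candidate k.
  wcss <- sapply(ks, function(k) {
    kmeans(data, centers = k, nstart = nstart)$tot.withinss
  })

  # Second differences approximate the second derivative:
  # element i equals wcss[i] - 2 * wcss[i + 1] + wcss[i + 2],
  # which is the curvature centred on k = i + 1.
  curvature <- diff(wcss, differences = 2)
  ks[which.max(curvature) + 1]
}

# Toy data: three tight 3 x 3 grids of points, far apart from each other.
grid <- as.matrix(expand.grid(x = c(0, 0.1, 0.2), y = c(0, 0.1, 0.2)))
blobs <- rbind(grid,
               sweep(grid, 2, c(10, 0), "+"),
               sweep(grid, 2, c(0, 10), "+"))

set.seed(42)  # kmeans() uses random starting centers
k_hat <- elbow_k(blobs, k_max = 6)
```

On data like this, where the WCSSE drops steeply until the true number of groups and then flattens, the largest second difference falls at that number of groups. On noisier curves the maximum can be ambiguous, which is one reason a voting-based alternative is useful.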

News

Version 1.0.3 (2019-02)

  • First release on CRAN.

install.packages("gama")

1.0.3 by Jairson Rodrigues, 2 months ago


https://github.com/jairsonrodrigues/gama


Browse source code at https://github.com/cran/gama


Authors: Jairson Rodrigues [aut, cre], Germano Vasconcelos [aut, ths], Renato Tinós [aut, rev]



GPL (>= 2) license


Imports ArgumentCheck, cluster, clusterCrit, NbClust, GA, ggplot2, methods, Rfast

Suggests knitr, rmarkdown

