The Genie++ Hierarchical Clustering Algorithm with Noise Points Detection

A retake on the Genie algorithm, a robust hierarchical clustering method (Gagolewski, Bartoszuk, Cena, 2016). It is now faster and more memory efficient: determining the whole hierarchy for datasets of 10M points in low-dimensional Euclidean spaces or 100K points in high-dimensional ones takes only 1-2 minutes. It allows clustering with respect to mutual reachability distances, so it can act as a noise point detector or as a robustified version of 'HDBSCAN*' that is able to detect a predefined number of clusters and hence does not depend on the somewhat fragile 'eps' parameter. The package also features an implementation of economic inequity indices (the Gini and Bonferroni indices) and external cluster validity measures (partition similarity scores, e.g., the adjusted Rand, Fowlkes-Mallows, adjusted mutual information, and pair sets indices). See also the 'Python' version of 'genieclust' available on 'PyPI', which supports sparse data, more metrics, and even larger datasets.
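To make the Gini index mentioned above concrete, here is a minimal sketch in plain Python. It does not call the package. It assumes one common normalised form of the index, the sum of absolute pairwise differences divided by (n - 1) times the total, which gives 0 for perfectly equal values and 1 when a single value holds everything. The function name `gini_index` is chosen here for illustration only, and the package's own implementation may differ in details.

```python
def gini_index(values):
    """Normalised Gini index of non-negative numbers (illustrative sketch).

    Returns 0.0 when all values are equal and 1.0 when one value
    accounts for the whole total.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n < 2 or total <= 0:
        raise ValueError("need at least two values with a positive sum")
    # For sorted data, the sum of |x_i - x_j| over all pairs i < j
    # equals sum_k (2k - n + 1) * x_k with k = 0, ..., n - 1.
    pairwise = sum((2 * k - n + 1) * x for k, x in enumerate(xs))
    return pairwise / ((n - 1) * total)


# Hypothetical cluster sizes: a balanced and a highly unbalanced partition.
balanced = gini_index([25, 25, 25, 25])
unbalanced = gini_index([97, 1, 1, 1])
```

In the Genie algorithm, an index of this kind is computed over cluster sizes, so a value near 1 signals that one cluster dominates the partition.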



install.packages("genieclust")

0.9.4 by Marek Gagolewski, 10 days ago


https://genieclust.gagolewski.com/


Report a bug at https://github.com/gagolews/genieclust/issues


Browse source code at https://github.com/cran/genieclust


Authors: Marek Gagolewski [aut, cre, cph], Maciej Bartoszuk [ctb], Anna Cena [ctb], Peter M. Larsen [ctb]


Documentation:   PDF Manual  


Task views: Cluster Analysis & Finite Mixture Models


AGPL-3 license


Imports Rcpp, stats, utils

Suggests datasets, emstreeR

Linking to Rcpp

System requirements: OpenMP, C++11


Depended on by genie.

