A Partial Clustering Algorithm

Provides the CrossClustering algorithm (Tellaroli et al., 2016), a partial clustering algorithm that combines Ward's minimum variance and Complete Linkage algorithms, automatically estimating a suitable number of clusters and identifying outlier elements.


CrossClustering


CrossClustering is a partial clustering algorithm that combines Ward’s minimum variance and Complete Linkage algorithms, providing automatic estimation of a suitable number of clusters and identification of outlier elements.
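To give a feel for the two ingredients named above, the following base-R sketch runs Ward's and complete-linkage hierarchical clustering on the same distance matrix and cross-tabulates the resulting partitions. It is only an illustration and not the CrossClustering algorithm itself: the choice of the iris data, the fixed k = 3, and all object names are assumptions made for this example.

```r
# Illustrative only: this is NOT the CrossClustering algorithm,
# just the two hierarchical methods it builds on, run side by side.
d <- dist(iris[, 1:4], method = "euclidean")

# Fit both hierarchical clusterings on the same distances.
ward_tree     <- hclust(d, method = "ward.D")
complete_tree <- hclust(d, method = "complete")

# Cut both trees at the same (arbitrarily chosen) number of clusters.
ward_labels     <- cutree(ward_tree, k = 3)
complete_labels <- cutree(complete_tree, k = 3)

# Cross-tabulate the partitions to see where the two methods agree.
agreement <- table(ward = ward_labels, complete = complete_labels)
print(agreement)
```

Elements on which the two methods disagree show up in the off-diagonal structure of the table; how CrossClustering actually combines the two partitions is described in Tellaroli et al. (2016).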

Example

This basic example shows how the main function, cc_crossclustering(), works:

library(CrossClustering)
 
#### method = "complete"
data(toy)
 
### toy is transposed as we want to cluster samples (columns of the original
### matrix)
d <- dist(t(toy), method = "euclidean")
 
### Run CrossClustering
toyres <- cc_crossclustering(d, k_w_min = 2, k_w_max = 5, k2_max = 6,
  out = TRUE)
toyres
#> 
#>     CrossClustering with method complete.
#> 
#> Parameter used:
#>   - Interval for the number of cluster of Ward's algorithm: [2, 5].
#>   - Interval for the number of cluster of the complete algorithm: [2, 6].
#>   - Outliers are considered.
#> 
#> Number of clusters found: 3.
#> Leading to an avarage silhouette width of: 0.8405.
#> 
#> A total of 6 elements clustered out of 7 elements considered.
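The transposition in the example matters because dist() computes distances between the rows of its input. A minimal sketch with a made-up 2 x 3 matrix (the object m is hypothetical and not part of the package) shows the effect:

```r
# Hypothetical matrix: 2 rows (features) and 3 columns (samples).
m <- matrix(c(1, 2, 3, 4, 5, 6), nrow = 2)

# dist() compares rows, so this yields distances among the 2 features.
n_rows_compared <- attr(dist(m), "Size")

# Transposing first yields distances among the 3 samples instead.
n_cols_compared <- attr(dist(t(m)), "Size")

print(c(rows = n_rows_compared, columns = n_cols_compared))
```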

Another useful function worth mentioning is ari():

clusters <- iris[-5] %>%
  dist() %>%
  hclust(method = 'ward.D') %>%
  cutree(k = 3)
 
ground_truth <- iris[[5]] %>%
  as.numeric()
 
table(ground_truth, clusters) %>% 
  ari()
#>     Adjusted Rand Index (alpha = 0.05)
#> 
#> ARI                  = 0.76 (moderate recovery)
#> Confidence interval  = [0.74, 0.78]
#> 
#> p-values:
#>   * Qannari test     = < 0.001
#>   * Permutation test =   0.001
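For intuition about the point estimate reported above, the adjusted Rand index can be computed from a contingency table with the standard Hubert and Arabie formula. The function below is a base-R sketch written for this illustration: adjusted_rand_index() is a hypothetical name, not the package's ari(), and it computes neither the confidence interval nor the p-values. It also does not handle the degenerate case in which both partitions consist of a single cluster.

```r
# Hypothetical helper, not part of CrossClustering.
# tab: a two-way contingency table of two partitions of the same elements.
adjusted_rand_index <- function(tab) {
  n <- sum(tab)

  # Pairs of elements placed together in both partitions.
  sum_cells <- sum(choose(tab, 2))

  # Pairs placed together within each partition separately.
  sum_rows <- sum(choose(rowSums(tab), 2))
  sum_cols <- sum(choose(colSums(tab), 2))

  # Expected agreement under random labelling, and the maximum attainable.
  expected  <- sum_rows * sum_cols / choose(n, 2)
  max_index <- (sum_rows + sum_cols) / 2

  (sum_cells - expected) / (max_index - expected)
}

# Two partitions that differ only in how the labels are named.
x <- c(1, 1, 2, 2, 3, 3)
y <- c(2, 2, 3, 3, 1, 1)
ari_same    <- adjusted_rand_index(table(x, x))
ari_relabel <- adjusted_rand_index(table(x, y))
print(c(same = ari_same, relabelled = ari_relabel))
```

Both calls return 1, since the index depends only on which elements are grouped together and not on the label values.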

Install

CRAN version

The CrossClustering package is on CRAN; install it in the standard way:

install.packages('CrossClustering')

Development version

To install the develop branch of the CrossClustering package, use:

# install.packages('devtools')
devtools::install_github('CorradoLanera/CrossClustering', ref = 'develop')

Bug reports

If you encounter a bug, please file a reprex (minimal reproducible example) at https://github.com/CorradoLanera/CrossClustering/issues

References

Tellaroli P., Bazzi M., Donato M., Brazzale A. R., Draghici S. (2016). Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters. PLoS ONE 11(3): e0152333. https://doi.org/10.1371/journal.pone.0152333

Tellaroli P., Bazzi M., Donato M., Brazzale A. R., Draghici S. (2017). E1829: Cross-Clustering: A Partial Clustering Algorithm with Automatic Estimation of the Number of Clusters. CMStatistics 2017, London, 16-18 December, Book of Abstracts (ISBN 978-9963-2227-4-2).

News

CrossClustering 4.0.3

  • resubmission to CRAN
  • fix DESCRIPTION issues

CrossClustering 4.0.2

  • resubmission to CRAN
  • fix DESCRIPTION issues

CrossClustering 4.0.1

  • submit to CRAN
  • url fixed
  • spellcheck
  • update .travis.yml to fix an error in the macOS-devel build: warnings_are_errors: false for that build.

CrossClustering 3.3.02

  • R versions 3.1 and 3.2 removed from Travis-CI
  • Reformat DESCRIPTION file

CrossClustering 3.3.01

  • Reference updated
  • Removed exported functions cc_test_ari() and cc_test_ari_permutation() because they are now included in ari()
  • Adapted code and test to the new structures and conventions
  • Added dependencies for package dplyr
  • Changed and renamed cc_max_proportion() to consensus_cluster(), a constructor of objects of class consensus_cluster
  • Created reverse_table() to come back from a contingency table to the unrolled vector of elements (issue #13)
  • Changes made in cc_get_clust() and cc_crossclustering() (issue #15)
  • Added examples for correlation (issue #14)
  • Changed and renamed cc_ari_contingency() to ari(), a constructor of objects of class ari (issue #12)
  • Added package cli into the dependencies
  • Update DESCRIPTION
  • Add Lifecycle badge
  • Add CRAN badge

CrossClustering 3.2.14

  • Added OSX on Travis-CI
  • Updated README
  • Minor style changes
  • Changed all unnecessary uses of dot (.) to underscore (_)

CrossClustering 3.1.42

  • Renamed which_cluster() to cc_get_cluster()
  • Renamed SignificanceARI() to cc_test_ari()
  • Renamed PermSignificanceARI() to cc_test_ari_permutation()
  • Renamed max_proportion_function() to cc_max_proportion()
  • Renamed CrossClustering() to cc_crossclustering()
  • Renamed ARI_contingency() to cc_ari_contingency()

CrossClustering 3.1.35

  • Added test for ARI_contingency() as requested in issue #7

CrossClustering 3.1.34

  • Added examples to main functions
  • Adopted a verb-like style for the function names
  • Added data/, data-raw/ and R/data.R to include example data into the package.
  • Adopted snake_case for function and variable names
  • Added functions: ARI_contingency(), PermSignificanceARI() and SignificanceARI().
  • Added support for complete and single method to CrossClustering()
  • Removed all the calls to : in favor of seq_*()
  • Removed all the calls to require() or library()
  • Removed all the calls to sapply()
  • Substituted geneinlista() with which_cluster()
  • Reshaped directory tree
  • Added dependencies.R to track imported dependencies
  • Added utils-pip.R to support pipe operator
  • Added utils.R for utility functions
  • Restyled all the code
  • Added test modules for all the functions
  • Added support for Travis, Appveyor and Codecov CI
  • Added a NEWS.md file to track changes to the package.


install.packages("CrossClustering")

4.0.3 by Paola Tellaroli, 9 months ago


https://CRAN.R-project.org/package=CrossClustering


Report a bug at https://github.com/CorradoLanera/CrossClustering/issues


Browse source code at https://github.com/cran/CrossClustering


Authors: Paola Tellaroli [cre, aut] , Marco Bazzi [aut] , Michele Donato [aut] , Livio Finos [aut] , Philippe Courcoux [aut] , Corrado Lanera [aut]




GPL (>= 3) license


Imports cluster, mclust, flip, magrittr, purrr, utils, assertive, crayon, glue, cli, dplyr

Suggests testthat, covr, roxygen2

