Cluster Strings by Edit-Distance

Returns an edit-distance-based clustering of an input vector of strings. Each cluster contains strings with small mutual edit distance (e.g., Levenshtein, optimal string alignment, Damerau-Levenshtein), as computed by stringdist::stringdist(). The set of all pairwise edit distances is then passed to graph algorithms (from package 'igraph') to single out subsets of high connectivity.
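The general idea can be illustrated with a minimal sketch. This is not clustringr's actual implementation or API. To stay dependency-free, it swaps in base R's utils::adist() (Levenshtein distance) for stringdist, and a hand-rolled breadth-first search for igraph's graph algorithms. The function name cluster_by_edit_distance, the max_dist parameter, and the choice of connected components as the "high connectivity" criterion are all assumptions made for illustration.

```r
# Hypothetical helper, NOT part of clustringr: groups strings that are linked
# by chains of pairs whose Levenshtein distance is at most `max_dist`.
cluster_by_edit_distance <- function(strings, max_dist = 1) {
  n <- length(strings)
  d <- utils::adist(strings)   # n x n matrix of pairwise Levenshtein distances
  adj <- d <= max_dist         # TRUE where two strings count as "close"

  membership <- rep(NA_integer_, n)
  cluster_id <- 0L
  for (start in seq_len(n)) {
    if (!is.na(membership[start])) next   # already assigned to a cluster
    cluster_id <- cluster_id + 1L
    membership[start] <- cluster_id
    queue <- start
    # Breadth-first search over the "close strings" graph
    while (length(queue) > 0) {
      node <- queue[1]
      queue <- queue[-1]
      neighbours <- which(adj[node, ] & is.na(membership))
      membership[neighbours] <- cluster_id
      queue <- c(queue, neighbours)
    }
  }
  split(strings, membership)   # list of clusters, named by cluster id
}

words <- c("kitten", "sitten", "mitten", "apple", "appla", "zebra")
clusters <- cluster_by_edit_distance(words, max_dist = 1)
print(clusters)
```

With max_dist = 1, strings that differ by a single edit end up in the same cluster, while a string with no close neighbour forms a cluster of its own. Connected components is the simplest possible grouping rule; a stricter criterion would split clusters that are only loosely chained together.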




install.packages("clustringr")

1.0 by Dan S. Reznik, 19 days ago


Browse source code at https://github.com/cran/clustringr


Authors: Dan S. Reznik


Documentation: PDF Manual


License: MIT + file LICENSE


Imports: magrittr, dplyr, stringi, stringr, stringdist, igraph, assertthat, forcats, rlang, tidygraph, ggraph, ggplot2

