Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) adjust a tree's graphical parameters (the color, size, type, etc. of its branches, nodes and labels) and (2) visually and statistically compare different 'dendrograms' to one another.
Class "dendrogram" provides general functions for handling tree-like structures in R. It is intended as a replacement for similar functions in hierarchical clustering and classification/regression trees, such that all of these can use the same engine for plotting or cutting trees.
However, many basic features are still missing from the dendrogram class. This package aims at filling in some gaps.
Extending R core dendrogram functions.
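As a point of reference, the base R "dendrogram" class described above can be exercised without any extra packages. The following is a minimal sketch using only the stats package; the choice of the USArrests data set, the ten-row subset and the cut height are illustrative assumptions, not something this package prescribes.

```r
# Base R only: build a dendrogram from a hierarchical clustering.
hc   <- hclust(dist(USArrests[1:10, ]), method = "complete")
dend <- as.dendrogram(hc)

# Generic helpers that work on any "dendrogram" object.
n_leaves   <- length(labels(dend))     # leaf labels, in plotting order
leaf_order <- order.dendrogram(dend)   # row indices of the original data
top_height <- attr(dend, "height")     # height of the root node

# Cut the tree at half the root height; cut() returns the
# upper part of the tree and a list of the lower sub-trees.
parts <- cut(dend, h = top_height / 2)

# The same plotting engine is used for any dendrogram.
plot(dend, main = "Base R dendrogram")
```

Anything beyond this (coloring branches by cluster, comparing two trees, and so on) is where the functions listed further down come in.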
To install the stable version on CRAN:
install.packages('dendextend')
install.packages('dendextendRcpp')
To install the GitHub version:
# Load a package by name, installing it first if it is missing.
# character.only = TRUE is needed because the name is passed as a string.
require2 <- function (package, ...) {
    if (!require(package, character.only = TRUE)) install.packages(package)
    library(package, character.only = TRUE)
}

## require2('installr')
## install.Rtools() # run this if you are using Windows and don't have Rtools installed

# Load devtools:
require2("devtools")
devtools::install_github('talgalili/dendextend')
require2("Rcpp")
devtools::install_github('talgalili/dendextendRcpp')

# Having colorspace is also useful, since it is used
# in various examples in the vignettes
require2("colorspace")
And then you may load the package using:
library(dendextend)
library(dendextendRcpp)
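Once the package is loaded, the two use cases from the overview (adjusting graphical parameters and comparing trees) might look roughly like this. It is a minimal sketch, assuming the `set`, `dendlist`, `tanglegram` and `cor_cophenetic` functions named in the change log below; the data set, the linkage methods and the parameter values are arbitrary choices made for illustration.

```r
library(dendextend)

# Two clusterings of the same data, differing only in linkage method.
d     <- dist(USArrests[1:15, ])
dend1 <- as.dendrogram(hclust(d, method = "complete"))
dend2 <- as.dendrogram(hclust(d, method = "average"))

# (1) Adjust graphical parameters with set().
dend1 <- set(dend1, "branches_k_color", k = 3)  # color branches by 3 clusters
dend1 <- set(dend1, "labels_cex", 0.7)          # shrink the leaf labels
plot(dend1)

# (2) Compare the two trees visually and statistically.
dl <- dendlist(dend1, dend2)
tanglegram(dl)                         # side-by-side plot with connecting lines
coph <- cor_cophenetic(dend1, dend2)   # cophenetic correlation between the trees
```

Because `set` returns the modified dendrogram, the calls can also be chained with `%>%`, which is the reason given in the change log for preferring a short function name.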
Vignettes:
If you have made interesting work using the dendextend package, I would LOVE to know about it. It can be a blog post, an academic paper, or just some plots you made for your work in the industry. Please contact me with what you have done, and I would also be happy to promote it on this page.
You are welcome to:
You can see the most recent changes to the package in the NEWS.md file:
collapse_branch
sort.dendrogram - added a new parameter: type = c("labels", "nodes"), to use ladderize for sorting
ggplot.ggdend - support theme = NULL
as.dendrogram.phylo and as.phylo.dendrogram functions to it.
labels and labels<-
which_node - finds which node is common to a group of labels
dendrogram_data (internal) function - a copy of the function from the ggdendro package (the basis for the new ggdend class).
get_leaves_nodePar - Get nodePar of dendrogram's leaves (designed to help with as.ggdend)
as.ggdend.dendrogram - turns a dendrogram to the ggdend class, ready to be plotted with ggplot2.
prepare.ggdend - fills a ggdend object with various default values (to be later used when plotted)
ggplot.ggdend - plots a ggdend with the ggplot2 engine (also the function theme_dendro was imported from the ggdendro package).
remove_nodes_nodePar - as the name implies...
collapse_branch - simplifies a tree with branches lower than some tolerance level
ladderize - Ladderize a Tree (reorganizes the internal structure of the tree to get the ladderized effect when plotted)
nleaves.phylo - no longer requires conversion to a dendrogram in order to compute.
labels<-.dendrogram - no longer forces as.character conversion
tanglegram - a new highlight_distinct_edges parameter (default is TRUE)
cutree - now produces warnings if it returns 0's, i.e. when it can't cut the tree based on the required parameter. (following issue #5 reported by grafab)
cutree_1h.dendrogram and cutree_1k.dendrogram - will now create clusters as the number of items if k==nleaves(tree) or if h<0. This is consistent with stats::hclust, and it also "makes sense" (since this is well defined for ANY tree). Also updated the tests.
get_branches_attr to be get_root_branches_attr
get_nodes_attr - added the "id" parameter (to get attributes of only a subset of id's)
as.dendrogram.phylo is properly exported now.
order.dendrogram<- - commenting off examples and tests which (as of R 3.1.1-patched) produce an error (as they should). Thanks to Prof Brian Ripley for the e-mail about it.
cor_bakers_gamma.Rd: \usage lines wider than 90 characters
compacted 'dendextend-tutorial.pdf' from 725Kb to 551Kb (doc fixes to pass CRAN checks) (thanks to using the following:
tools::compactPDF("inst\\doc\\dendextend-tutorial.pdf",
qpdf = "C:\\Program Files (x86)\\qpdf-5.1.2\\bin\\qpdf.exe",
gs_cmd = "C:\\Program Files\\gs\\gs9.14\\bin\\gswin64c.exe",
gs_quality="ebook")
and to the help of Prof Brian Ripley and Kurt Hornik)
get_nodes_xy - Get the x-y coordinates of a dendrogram's nodes
all_unique - check if all elements in a vector are unique
head.dendlist
rainbow_fun - uses rainbow_hcl, or rainbow (if colorspace is not available)
warn parameters are now set to dendextend_options("warn") (which is FALSE)!
get_branches_attr - change "warning" to "warn", and it now works with is.dendrogram, and no longer changes the class of something which is not a dendrogram.
untangle_step_rotate_2side - print_times is now dendextend_options("warn"),
color_branches - now handles flat trees more gracefully. (returns them as they are)
cutree.dendrogram - now replaces NA values with 0L (fix tests for it), added a parameter (NA_to_0L) to control it.
Bk - Have it work with cutree(NA_to_0L = FALSE)
set.dendrogram - added explanation in the .Rd docs of the different possible options for "what"
set.dendrogram - added nodes_pch, nodes_cex and nodes_col - using assign_values_to_nodes_nodePar
set.dendrogram - changed from using labels_colors<- to color_labels for "labels_colors" (this will now work with using k...)
set.dendrogram - if "what" is missing, return the object as is.
set.dendrogram - added a "labels_to_char" option.
labels_colors<- - added if(dendextend_options("warn"))
labels<-.dendrogram - if value is missing, returning the dendrogram as is (this also affects set)
get_nodes_attr - can now return an array or a list for attributes which include a more complex structure (such as nodePar), by working with lists and adding a "simplify" parameter.
rect.dendrogram - new xpd and lower_rect parameters - to control how low the rect will be (for example, below or above the labels). The default is below the labels.
colored_bars - added defaults to make the bars be plotted below the labels. + allow the order of the bars to be based on the labels' order, made that to be the default + have scale default be better for multiple bars.
branches_attr_by_labels now uses dendextend_options("warn") to decide if to print that labels were coerced into character.
intersect_trees - now returns a dendlist.
untangle - has a default to method (DendSet)
untangle_step_rotate_1side - added "leaves_matching_method" parameter.
entanglement.dendrogram - changed the default of "leaves_matching_method" to be "labels" (slower, but safer for the user...)
branches_attr_by_clusters and branches_attr_by_labels - moved from using NA to Inf.
color_branches - can now work when the labels of the tree are not unique ("feature" request by Heather Turner - thanks Heather :) )
rect.dendrogram - fix a bug with the location of the rect's (using "tree" and not "dend")
rect.dendrogram - Made sure the heights are working properly!
colored_bars - fix for multiple bars to work.
assign_values_to_branches_edgePar, assign_values_to_nodes_nodePar, assign_values_to_leaves_nodePar - now ignores "Inf" also when it is a character by adding as.numeric (and not only if it is numeric!) (this might be a problem if someone would try to update a label with the name "Inf").
dendextend_options functions to it.
branches_attr_by_labels between two files.
assign_values_to_branches_edgePar - make sure it deals with Inf and "Inf".
as.dendrogram.pvclust - extracts the hclust from a pvclust object, and turns it into a dendrogram.
hc2axes - imported from pvclust, needed for text.pvclust
text.pvclust - imported from pvclust, adds text to a dend plot of a pvclust result
pvclust_show_signif - Shows the significant branches in a dendrogram, based on a pvclust object
pvclust_show_signif_gradient - Shows the gradient of significance of branches in a dendrogram, based on a pvclust object
assign_values_to_leaves_nodePar, assign_values_to_nodes_nodePar, assign_values_to_branches_edgePar - If the value has Inf (instead of NA!) then the value will not be changed.
assign_values_to_nodes_nodePar - Assign values to nodePar of dendrogram's nodes
assign_values_to_leaves_nodePar - If the value has NA then the value in edgePar will not be changed.
branches_attr_by_clusters - This function was designed to enable the manipulation (mainly coloring) of branches, based on the results from the cutreeDynamic function (from the {dynamicTreeCut} package).
which_leaf - Which node is a leaf?
na_locf - Fill Last Observation Carried Forward
colored_bars - change the order of colors and dend, and allowing for dend to be missing. (also some other doc modifications)
branches_attr_by_labels - change the order of some parameters (based on how much I expect users to use each of them.)
assign_values_to_branches_edgePar - allow the option to skip leaves
noded_with_condition - Find which nodes satisfy a condition
branches_attr_by_labels - Change col/lwd/lty of branches matching labels condition
rect.dendrogram - adding parameters for creating text under the clusters, as well as making it easier to plot lines on the rect (density = 7). Props to skullkey for his help.
set.dendrogram - added new options: by_labels_branches_col, by_labels_branches_lwd, by_labels_branches_lty
order.hclust - Ordering of the Leaves in a hclust Dendrogram
rect.dendrogram - just like rect.hclust, plus: works for dendrograms, passes ... to rect for lwd lty etc, now has an horiz parameter!
identify.dendrogram - like identify.hclust: reads the position of the graphics pointer when the (first) mouse button is pressed. It then cuts the tree at the vertical position of the pointer and highlights the cluster containing the horizontal position of the pointer. Optionally a function is applied to the index of data points contained in the cluster.
rect.dendrogram.R
add functions to be called set. Reason: both are short names (important for chaining), both are not used in base R. "add" is used in magrittr (not good long term), and "set" sounds better English wise (we are setting labels color, more than adding it...).
dendlist - a function which creates a list of dendrograms of the new "dendlist" class.
tanglegram.dendlist
entanglement.dendlist
is.dendlist - to check that an object is a dendlist
as.dendlist - to turn a list to a dendlist
plot.dendlist - it is basically a wrapper to tanglegram.
click_rotate - interactively rotate a tree (thanks to Andrej-Nikolai Spiess)
untangle - a master function to control all untangle functions (making it much easier to navigate this feature, as well as use it through %>% piping)
untangle_DendSer - a new untangle function (this time, only for dendlist), for leveraging the serialization package for some more heuristics (based on the functions rotate_DendSer and DendSer.dendrogram).
add.dendrogram - a new master function to allow various updating of dendrogram objects. It includes options for: labels, labels_colors, labels_cex, branches_color, hang, leaves_pch, leaves_cex, leaves_col, branches_k_color, branches_col, branches_lwd, branches_lty, clear_branches, clear_leaves
add.dendlist - a wrapper to add.dendrogram.
colored_bars - adding colored bars underneath a dendrogram plot.
untangle functions will return a dendlist (and also that untangle_step_rotate_2side will be able to work with the new untangle_step_rotate_1side output)
labels_colors<- - now has a default behavior if value is missing. Also made sure it is more robust (for cases with partial attr in nodePar)
color_branches - now has a default behavior if k is missing.
assign_values_to_branches_edgePar - value can now be different than 1 (it now also has a recycle option for the value)
is.dendrogram more.
tanglegram - now preserves and restores previous par options (will no longer have a tiny plot in the left corner, when using a simple plot after tanglegram)
tanglegram.dendlist
FILE CHANGES:
cor_bakers_gamma
sample.dendrogram
rank_order.dendrogram - for fixing leaves value order.
duplicate_leaf - for sample.dendrogram
sample.dendrogram - for bootstrapping trees when the original data table is missing.
sort_dist_mat
cor_cophenetic
FILE CHANGES:
tanglegram - major addition!
plot_horiz.dendrogram - Plotting a left-tip-adjusted horizontal dendrogram
remove_leaves_nodePar
assign_values_to_branches_edgePar
remove_branches_edgePar
match_order_by_labels
match_order_dendrogram_by_old_order - like match_order_by_labels, but faster
entanglement
(including tests and documentation)
is.natural.number
cutree_1h.dendrogram - like cutree, but only for 1 height value.
fix_members_attr.dendrogram - just to validate that prune works o.k.
hang.dendrogram - hangs a dendrogram's leaves (also allows for a rotated hanged dendrogram), works also for non-binary trees.
nnodes - count the number of nodes in a tree
as.dendrogram.phylo - based on as.hclust.
get_nodes_attr - allows easy access to attributes of branches and leaves
get_branches_heights
fix_members_attr.dendrogram
heights_per_k.dendrogram - get the heights for a tree that will yield each k cluster.
is.hclust
is.dendrogram
is.phylo
fac2num
as.phylo.dendrogram - based on as.hclust.
cutree_1k.dendrogram - like cutree, but only for 1 k (number of clusters) value.
cutree.dendrogram - like cutree but for dendrograms (and it is also vectorized)
cutree.hclust - like cutree but for hclust
cutree.phylo - like cutree but for phylo
sort_levels_values - make the resulting clusters from cutree to be ordered from left to right
cutree - with S3 methods for dendrogram/hclust/phylo
color_branches - color a tree's branches based on its clusters. This is a modified version of the color_clusters function from jefferis's dendroextra package. It extends it by using my own version of cutree.dendrogram, allowing the function to work for trees that hclust can not handle (unrooted and non-ultrametric trees). It also allows REPEATED cluster color assignments to branches on the same tree, something which the original function was not able to handle. It also handles extreme cases, such as when the labels of a tree are integers.
color_labels - just like color_branches, but for labels.
assign_values_to_leaves_nodePar - allows for complex manipulation of dendrogram's leaves parameters.
* Includes skeletons for some functions that will be added in the future.
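
The cutree methods and the coloring functions listed above can be tried together. The following is a minimal sketch, assuming the package is installed; the data set, the subset size and the values of k are arbitrary choices made for illustration.

```r
library(dendextend)

hc   <- hclust(dist(USArrests[1:12, ]))
dend <- as.dendrogram(hc)

# cutree() dispatches on the class of the tree.
cl_hc   <- cutree(hc, k = 3)     # hclust method
cl_dend <- cutree(dend, k = 3)   # dendrogram method

# The dendrogram method is vectorized over k: one column per value of k.
cl_multi <- cutree(dend, k = 2:4)

# Color branches and labels by the same 3 clusters.
dend_col <- color_branches(dend, k = 3)
dend_col <- color_labels(dend_col, k = 3)
plot(dend_col)
```

Each call returns a new object rather than modifying the tree in place, so the uncolored `dend` stays available for further comparisons.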