Extending 'Dendrogram' Functionality in R

Offers a set of functions for extending 'dendrogram' objects in R, letting you visualize and compare trees of 'hierarchical clusterings'. You can (1) adjust a tree's graphical parameters (the color, size, type, etc. of its branches, nodes and labels), and (2) visually and statistically compare different 'dendrograms' to one another.


Class "dendrogram" provides general functions for handling tree-like structures in R. It is intended as a replacement for similar functions in hierarchical clustering and classification/regression trees, such that all of these can use the same engine for plotting or cutting trees.

However, many basic features are still missing from the dendrogram class. This package aims at filling in some gaps.
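
For reference, here is a minimal sketch of what the base "dendrogram" class already offers, using only the stats package. The USArrests subset and the variable names are illustrative assumptions, not part of this package.

```r
# Build a dendrogram from a hierarchical clustering (base R only)
hc <- hclust(dist(USArrests[1:10, ]), method = "complete")
dend <- as.dendrogram(hc)

# Basic accessors that ship with the "dendrogram" class
leaf_labels <- labels(dend)           # leaf labels, in plotting order
leaf_order <- order.dendrogram(dend)  # row indices of the original data
n_leaves <- attr(dend, "members")     # number of leaves under the root
root_height <- attr(dend, "height")   # height of the top merge

print(leaf_labels)
```

Anything beyond these accessors (coloring branches, comparing two trees, and so on) is where the functions in this package come in.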

Extending R core dendrogram functions.

To install the stable version from CRAN:

install.packages('dendextend')
install.packages('dendextendRcpp')

To install the GitHub version:

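require2 <- function (package, ...) {
    # character.only = TRUE is needed because `package` holds the name as a string
    if (!require(package, character.only = TRUE)) install.packages(package)
    library(package, character.only = TRUE)
}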
 
## require2('installr')
## install.Rtools() # run this if you are using Windows and don't have Rtools installed
 
# Load devtools:
require2("devtools")
devtools::install_github('talgalili/dendextend')
require2("Rcpp")
devtools::install_github('talgalili/dendextendRcpp')
 
# Having colorspace is also useful, since it is used
# in various examples in the vignettes
require2("colorspace")

And then you may load the package using:

library(dendextend)
library(dendextendRcpp)
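
Once the package is loaded, a first session might look like the sketch below. It uses the set function with the "branches_k_color" and "labels_cex" options described in the News section; the data subset, k = 3 and the 0.7 label size are arbitrary choices for illustration. The requireNamespace guard is only there so the snippet is skipped when dendextend is not installed.

```r
# Runs only if dendextend is installed; otherwise the block is skipped
if (requireNamespace("dendextend", quietly = TRUE)) {
  dend <- as.dendrogram(hclust(dist(USArrests[1:10, ])))

  # Color the branches by 3 clusters, then shrink the labels
  dend_styled <- dendextend::set(dend, "branches_k_color", k = 3)
  dend_styled <- dendextend::set(dend_styled, "labels_cex", 0.7)

  plot(dend_styled, main = "USArrests (first 10 states)")
}
```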

Vignettes:

  • http://htmlpreview.github.io/?https://github.com/talgalili/dendextend/blob/master/inst/ignored/introduction.html
  • https://github.com/talgalili/dendextend/blob/master/inst/doc/dendextend-tutorial.pdf (older)

If you have made interesting work using the dendextend package, I would LOVE to know about it. It can be a blog post, an academic paper, or just some plots you made for your work in the industry. Please contact me with what you have done, and I would also be happy to promote it on this page.

You can see the most recent changes to the package in the NEWS.md file:

  • https://github.com/talgalili/dendextend/blob/master/NEWS.md

News

###OTHER NOTES: (Thanks to Prof Brian Ripley for help.)

  • Fix "title case" for the package's Title in the DESCRIPTION
  • Fix "No package encoding and non-ASCII characters in the following R files"
  • Fix "Please use :: or requireNamespace() instead." by commenting out all "library", since using "::" is enough!
  • dendextend 0.18.3 is intended to be shipped to CRAN.

###OTHER NOTES:

  • Minor doc fix for collapse_branch
  • dendextend 0.18.2 is intended to be shipped to CRAN.

###VIGNETTE - new sections!

  • Quick functions for FAQ
    • How to colour the labels of a dendrogram by an additional factor variable
    • How to color a dendrogram's branches/labels based on cluster (i.e.: cutree result)
    • Change dendrogram's labels
    • Larger font for leaves in a dendrogram
    • How to view attributes of a dendrogram
  • ggplot2 integration (!)
  • Comparing trees:
    • dend_diff
    • all.equal
    • dist.dendlist
    • cor.dendlist
  • Others
    • rotate - explain about sort
    • collapse_branch

###UPDATED FUNCTIONS:

  • sort.dendrogram - added a new parameter: type = c("labels", "nodes"), to use ladderize for sorting
  • ggplot.ggdend - support theme = NULL

###OTHER NOTES:

  • dendextend 0.18.1 is intended to be shipped to CRAN.

###NEW FILES:

  • ape.R - moved as.dendrogram.phylo and as.phylo.dendrogram functions to it.
  • cor.dendlist.R - For the cor.dendlist function
  • renamed imports_stats.R to stats_imports.R
  • ggdendro.R
  • ggdend.R
  • dist_long.R

###NEW FUNCTIONS:

  • More connections:
    • A new phylo method for labels and labels<-
  • More ways to compare trees:
    • cor.dendlist - Correlation matrix between a list of trees.
    • partition_leaves - A list with labels for each subtree (edge)
    • distinct_edges - Finds the edges present in the first tree but not in the second
    • highlight_distinct_edges - Highlight distinct edges in a tree (compared to another one). Works for both dendrogram and dendlist.
    • dend_diff - Plots two trees side by side, highlighting edges unique to each tree in red. Works for both dendrogram and dendlist.
    • dist.dendlist - Topological Distances Between Two dendrograms (currently only the Robinson-Foulds distance)
    • all.equal.dendrogram/all.equal.dendlist - Global Comparison of two (or more) dendrograms
    • which_node - finds which node is common to a group of labels
  • Dendrograms in ggplot2! (enhancing the ggdendro package)
    • dendrogram_data (internal) function - a copy of the function from the ggdendro package (the basis for the new ggdend class).
    • get_leaves_nodePar - Get nodePar of dendrogram's leaves (designed to help with as.ggdend)
    • as.ggdend.dendrogram - turns a dendrogram to the ggdend class, ready to be plotted with ggplot2.
    • prepare.ggdend - fills a ggdend object with various default values (to be later used when plotted)
    • ggplot.ggdend - plots a ggdend with the ggplot2 engine (also the function theme_dendro was imported from the ggdendro package).
  • Others:
    • remove_nodes_nodePar - as the name implies...
    • collapse_branch - simplifies a tree with branches lower than some tolerance level
    • ladderize - Ladderize a Tree (reorganizes the internal structure of the tree to get the ladderized effect when plotted)

###NEW TESTS:

  • partition_leaves
  • distinct_edges
  • dend_diff
  • dist.dendlist

###UPDATED FUNCTIONS:

  • nleaves.phylo - no longer require conversion to a dendrogram in order to compute.
  • labels<-.dendrogram - no longer forces as.character conversion
  • tanglegram - a new highlight_distinct_edges parameter (default is TRUE)
  • cutree - now produces warnings if it returns 0's. i.e.: when it can't cut the tree based on the required parameter. (following issue #5 reported by grafab)
  • cutree_1h.dendrogram and cutree_1k.dendrogram - will now create as many clusters as there are items if k==nleaves(tree) or if h<0. This is consistent with stats::hclust, and it also "makes sense" (since this is well defined for ANY tree). Also updated the tests.
  • Rename get_branches_attr to be get_root_branches_attr
  • get_nodes_attr - added the "id" parameter (to get attributes of only a subset of id's)
  • as.dendrogram.phylo is properly exported now.
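
The k==nleaves(tree) behavior noted above for cutree_1k.dendrogram can be compared against base R. This sketch shows only what stats::cutree does on an hclust object (the 5-row data subset is an arbitrary choice); it does not call the dendextend methods themselves.

```r
hc <- hclust(dist(USArrests[1:5, ]))

# Asking for as many clusters as there are items
# puts each item in its own cluster
clusters <- cutree(hc, k = 5)
print(clusters)

n_clusters <- length(unique(clusters))
```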

###NEW FUNCTIONS:

  • dist_long - Turns a dist object to a "long" table.
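
To illustrate what a "long" table of a dist object can look like, here is a base R sketch. The helper name dist_to_long_sketch and its column names are hypothetical; this is not the package's dist_long implementation, and its output format may differ.

```r
# Hypothetical helper (not the package's dist_long):
# turn a dist object into a data.frame with one row per pair of items
dist_to_long_sketch <- function(d) {
  m <- as.matrix(d)
  idx <- which(lower.tri(m), arr.ind = TRUE)  # each unordered pair once
  data.frame(
    rows = rownames(m)[idx[, "row"]],
    cols = colnames(m)[idx[, "col"]],
    distance = m[idx],
    stringsAsFactors = FALSE
  )
}

d <- dist(USArrests[1:4, ])
long <- dist_to_long_sketch(d)
print(long)
```

With 4 items the table has choose(4, 2) rows, one per pair.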

###UPDATED FUNCTIONS:

  • order.dendrogram<- - commented out the examples and tests which (as of R 3.1.1-patched) produce an error (as they should). Thanks to Prof Brian Ripley for the e-mail about it.

###BUG FIXES:

  • checking S3 generic/method consistency ... WARNING cor_bakers_gamma: function(tree1, tree2, use_labels_not_values, to_plot, warn, ...) cor_bakers_gamma.dendlist: function(tree1, which, ...)
  • Undocumented code objects: 'plot.dendlist'
  • cor_bakers_gamma.Rd: \usage lines wider than 90 characters

###VIGNETTE:

  • Fixed several typos and grammatical mistakes.

###OTHER NOTES:

  • dendextend 0.17.5 is intended to be shipped to CRAN (to stay compatible with R 3.1.1-patched).

###NEW FUNCTIONS:

  • set.data.table - informs the user of the conflict in "set" between dendextend and data.table.

###VIGNETTE:

  • added sessionInfo

###UPDATED FUNCTIONS:

  • color_labels - can now color labels when called without k and h in cases where the tree cannot be cut into nleaves items (due to several leaves with 0 height). In such cases no cutting is done, and labels_colors is used directly. Tests were added. Bug report by Marina Varfolomeeva, a.k.a varmara - thanks! (https://github.com/talgalili/dendextend/issues/3 )

###NEW FUNCTIONS:

  • cor_bakers_gamma.dendlist
  • assign_values_to_leaves_edgePar (noticed the need from this question: http://stackoverflow.com/questions/23328663/color-branches-of-dendrogram-using-an-existing-column?rq=1)

###UPDATED FUNCTIONS:

  • assign_values_to_leaves_nodePar - added if(warn), if value is missing.

###VIGNETTE:

  • Fix some typos and mistakes.
  • Add to introduction.Rmd how to install the package from github.

###OTHER NOTES:

  • compacted 'dendextend-tutorial.pdf' from 725Kb to 551Kb (doc fixes to pass CRAN checks), thanks to using the following:

         tools::compactPDF("inst\\doc\\dendextend-tutorial.pdf", 
                           qpdf = "C:\\Program Files (x86)\\qpdf-5.1.2\\bin\\qpdf.exe", 
                           gs_cmd = "C:\\Program Files\\gs\\gs9.14\\bin\\gswin64c.exe",
                           gs_quality="ebook") 
    

    And thanks to the help of Prof Brian Ripley and Kurt Hornik.

###VIGNETTE:

  • Wrote a new vignette "introduction.Rmd", to showcase the new functions since the last vignette, and give a quick-as-possible introduction to the package functions.

###NEW FUNCTIONS:

  • get_nodes_xy - Get the x-y coordinates of a dendrogram's nodes
  • all_unique - check if all elements in a vector are unique
  • head.dendlist
  • rainbow_fun - uses rainbow_hcl, or rainbow (if colorspace is not available)

###UPDATED FUNCTIONS:

  • ALL warn parameters are now set to dendextend_options("warn") (which is FALSE)!
  • get_branches_attr - change "warning" to "warn", and it now works with is.dendrogram, and no longer changes the class of something which is not a dendrogram.
  • untangle_step_rotate_2side - print_times is now dendextend_options("warn").
  • color_branches - now handles flat trees more gracefully. (returns them as they are)
  • cutree.dendrogram - now replaces NA values with 0L (fix tests for it), added a parameter (NA_to_0L) to control it.
  • Bk - Have it work with cutree(NA_to_0L = FALSE)
  • set.dendrogram - added explanation in the .Rd docs of the different possible options for "what"
  • set.dendrogram - added nodes_pch, nodes_cex and nodes_col - using assign_values_to_nodes_nodePar
  • set.dendrogram - changed from using labels_colors<- to color_labels for "labels_colors" (this will now work with using k...)
  • set.dendrogram - if "what" is missing, return the object as is.
  • set.dendrogram - added a "labels_to_char" option.
  • labels_colors<- - added if(dendextend_options("warn"))
  • labels<-.dendrogram - if value is missing, returning the dendrogram as is (this also affects set)
  • get_nodes_attr - can now return an array or a list for attributes which include a more complex structure (such as nodePar), by working with lists and adding a "simplify" parameter.
  • rect.dendrogram - a new xpd and lower_rect parameters - to control how low the rect will be (for example, below or above the labels). The default is below the labels.
  • colored_bars - added defaults so that the bars are plotted below the labels; allowed the order of the bars to be based on the labels' order (now the default); improved the default scale for multiple bars.
  • branches_attr_by_labels now uses dendextend_options("warn") to decide if to print that labels were coerced into character.
  • intersect_trees - now returns a dendlist.
  • untangle - now has a default for method (DendSer)
  • untangle_step_rotate_1side - added "leaves_matching_method" parameter.
  • entanglement.dendrogram - changed the default of "leaves_matching_method" to be "labels" (slower, but safer for the user...)

###BUG FIXES:

  • branches_attr_by_clusters and branches_attr_by_labels - moved from using NA to Inf.
  • color_branches - can now work when the labels of the tree are not unique ("feature" request by Heather Turner - thanks Heather :) )
  • rect.dendrogram - fix a bug with the location of the rect's (using "tree" and not "dend")
  • rect.dendrogram - Made sure the heights are working properly!
  • colored_bars - fix for multiple bars to work.
  • assign_values_to_branches_edgePar, assign_values_to_nodes_nodePar, assign_values_to_leaves_nodePar - now ignores "Inf" also when it is a character by adding as.numeric (and not only if it is numeric!) (this might be a problem if someone would try to update a label with the name "Inf").

###NEW FILES:

  • dendextend_options.R - moved dendextend_options functions to it.
  • get_nodes_xy.R
  • Rename files: trim.R -> prune.R
  • DendSer.R
  • Move the function branches_attr_by_labels between two files.

###NEW TESTS:

  • assign_values_to_branches_edgePar - make sure it deals with Inf and "Inf".

###OTHER NOTES:

  • Moved ggdendro,labeltodendro,dendroextras,ape to "Enhances:" in DESCRIPTION.
  • Moved dendextend-tutorial.rnw to vignettes\disabled - so it is still there, but not compiled.
  • Moved dendextend-tutorial.pdf to inst\doc - so there is a copy of this older vignette, but without needing to run it with all the benchmarks... (it is also compressed)
  • Created a copy of "introduction.html" in inst/ignored (so people could see it on github)
  • Have the package build the vignette.

###NEW FUNCTIONS:

  • as.dendrogram.pvclust - extract the hclust from a pvclust object, and turns it into a dendrogram.
  • hc2axes - imported from pvclust, needed for text.pvclust
  • text.pvclust - imported from pvclust, adds text to a dend plot of a pvclust result
  • pvclust_show_signif - Shows the significant branches in a dendrogram, based on a pvclust object
  • pvclust_show_signif_gradient - Shows the gradient of significance of branches in a dendrogram, based on a pvclust object

###UPDATED FUNCTIONS:

  • assign_values_to_leaves_nodePar, assign_values_to_nodes_nodePar, assign_values_to_branches_edgePar - If the value has Inf (instead of NA!) then the value will not be changed.

###NEW FUNCTIONS:

  • assign_values_to_nodes_nodePar - Assign values to nodePar of dendrogram's nodes

###UPDATED FUNCTIONS:

  • assign_values_to_leaves_nodePar - If the value has NA then the value in edgePar will not be changed.

###OTHER NOTES:

  • NEWS - updated to use header 2 and 3 instead of 1 and 2 for the markdown version.

###OTHER NOTES:

  • require -> library (Thanks Yihui: http://yihui.name/en/2014/07/library-vs-require/)

###OTHER NOTES:

  • Minor doc fixes to pass CRAN checks.

###NEW FUNCTIONS:

  • branches_attr_by_clusters - This function was designed to enable the manipulation (mainly coloring) of branches, based on the results from the cutreeDynamic function (from the {dynamicTreeCut} package).
  • which_leaf - Which node is a leaf?
  • na_locf - Fill Last Observation Carried Forward
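
"Last observation carried forward" means each NA is replaced by the most recent non-NA value before it. The following is a base R sketch of that idea; the helper name na_locf_sketch is hypothetical, and the package's na_locf may handle edge cases (such as leading NAs) differently.

```r
# Hypothetical helper (not the package's na_locf):
# replace each NA with the last non-NA value seen before it
na_locf_sketch <- function(x) {
  seen <- cumsum(!is.na(x))   # how many non-NA values so far
  vals <- x[!is.na(x)]        # the non-NA values, in order
  out <- x
  has_prev <- seen > 0        # leading NAs have nothing to carry forward
  out[has_prev] <- vals[seen[has_prev]]
  out
}

filled <- na_locf_sketch(c(NA, 1, NA, NA, 3, NA))
print(filled)  # NA 1 1 1 3 3
```

In this sketch a leading NA stays NA, since there is no earlier observation to copy.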

###UPDATED FUNCTIONS:

  • assign_values_to_branches_edgePar - can now keep the existing value if it gets NA.
  • colored_bars - change the order of colors and dend, and allowing for dend to be missing. (also some other doc modifications)
  • branches_attr_by_labels - change the order of some parameters (based on how much I expect users to use each of them.)
  • assign_values_to_branches_edgePar - allow the option to skip leaves

###NEW FILES:

  • branches_attr_by.R - for branches_attr_by_clusters

###OTHER NOTES:

  • added a pvclust example (using a condition on p-value, and highlighting branches based on that with lwd/col.)

###NEW FUNCTIONS:

  • noded_with_condition - Find which nodes satisfy a condition
  • branches_attr_by_labels - Change col/lwd/lty of branches matching labels condition

###UPDATED FUNCTIONS:

  • rect.dendrogram - added parameters for creating text under the clusters, as well as making it easier to plot lines on the rect (density = 7). props to skullkey for his help.
  • set.dendrogram - added new options: by_labels_branches_col, by_labels_branches_lwd, by_labels_branches_lty

###NEW FILES:

  • noded_with_condition.R

###NEW FUNCTIONS:

  • order.hclust - Ordering of the Leaves in a hclust Dendrogram
  • rect.dendrogram - just like rect.hclust, plus: works for dendrograms, passes ... to rect for lwd lty etc, now has a horiz parameter!
  • identify.dendrogram - like identify.hclust: reads the position of the graphics pointer when the (first) mouse button is pressed. It then cuts the tree at the vertical position of the pointer and highlights the cluster containing the horizontal position of the pointer. Optionally a function is applied to the index of data points contained in the cluster.

###NEW FILES:

  • rect.dendrogram.R

###OTHER NOTES:

  • Rename the add functions to be called set. Reason: both are short names (important for chaining), both are not used in base R. "add" is used in magrittr (not good long term), and "set" sounds better English wise (we are setting labels color, more than adding it...).
  • Rename 2 file names from add->set (set.dendrogram.R and tests-set.dendrogram.R)

###NEW FUNCTIONS:

  • dendlist - a function which creates a list of dendrogram of the new "dendlist" class.
    • tanglegram.dendlist
    • entanglement.dendlist
    • is.dendlist - to check that an object is a dendlist
    • as.dendlist - to turn a list to a dendlist
    • plot.dendlist - it is basically a wrapper to tanglegram.
  • click_rotate - interactively rotate a tree (thanks to Andrej-Nikolai Spiess)
  • untangle - a master function to control all untangle functions (making it much easier to navigate this feature, as well as use it through %>% piping)
  • untangle_DendSer - a new untangle function (this time, only for dendlist), for leveraging the DendSer (dendrogram seriation) package for some more heuristics (based on the functions rotate_DendSer and DendSer.dendrogram).
  • add.dendrogram - a new master function to allow various updating of dendrogram objects. It includes options for: labels, labels_colors, labels_cex, branches_color, hang, leaves_pch, leaves_cex, leaves_col, branches_k_color, branches_col, branches_lwd, branches_lty, clear_branches, clear_leaves
  • add.dendlist - a wrapper to add.dendrogram.
  • colored_bars - adding colored bars underneath a dendrogram plot.

###UPDATED FUNCTIONS:

  • Made sure that the main untangle functions will return a dendlist (and also that untangle_step_rotate_2side will be able to work with the new untangle_step_rotate_1side output)
  • switched to using match.arg wherever possible (Bk_plot, cor_cophenetic, entanglement, untangle_random_search, untangle_step_rotate_1side, and untangle_step_rotate_2side).
  • labels_colors<- - now has a default behavior if value is missing. Also made sure it is more robust (for cases with partial attr in nodePar)
  • color_branches - now has a default behavior if k is missing.
  • assign_values_to_branches_edgePar - value can now be different than 1 (it now also has a recycle option for the value)
  • Generally - moved to using is.dendrogram more.
  • tanglegram - now preserve and restore previous par options (will no longer have a tiny plot in the left corner, when using a simple plot after tanglegram)

###NEW S3 METHODS:

  • tanglegram.dendlist

###NEW FILES:

  • dendlist.R
  • test-dendlist.R
  • test-add.dendrogram.R
  • add.dendrogram.R
  • colored_bars.R
  • magrittr.R

###UPDATED TESTS:

  • Check dendlist works

###OTHER NOTES:

  • DESCRIPTION -
    • Added the magrittr package as a Depends.
    • changed stats from depends to imports. Here is a good reference for why to choose the one over the other - http://stackoverflow.com/questions/8637993/better-explanation-of-when-to-use-imports-depends And: http://stackoverflow.com/questions/6895852/load-a-package-only-when-needed-in-r-package
  • Fix errors and typos in vignettes - thank you Bob Muenchen!
  • Fix the docs of the functions in dendextend which relates to the newer dendextendRcpp (version 0.5.1): cut_lower_fun, get_branches_heights, heights_per_k.dendrogram
  • tests - Moved from using test_that with equal() to test_equal (due to some conflict with, possibly, devtools)
  • roxygen2 - Moved from using @S3method to @export (removed 45 warnings from check() )
  • Moved all "@import" to the dendextend-package.R file (just to make it easier to follow up on them). This code makes sure that these packages will be mentioned in the NAMESPACE file.
  • Imported the %>% function from magrittr (using a trick from the dplyr package)

###OTHER NOTES:

  • Changed all R script files from .r to .R!

###UPDATED DESCRIPTION:

  • Fix an author name.
  • Added dendextendRcpp to Suggests

###OTHER NOTES:

  • Minor changes to docs.

###UPDATED DESCRIPTION:

  • Added dependency for R (>= 3.0.0)

###OTHER NOTES:

  • dendextend 0.14.2 is intended to be shipped to CRAN.

###UPDATED DESCRIPTION:

  • Added Uwe and Kurt as contributors.
  • Removed Suggests: dendextendRcpp, (until it would be on CRAN)
  • Removed link to google group

###NEW FUNCTIONS:

  • dendextend_options (actually an environment + a function). Here I've moved the dendextend_options from the global environment to the dendextend namespace.

###UPDATED TESTS:

  • update test_rotate.r so it would make sure ape is loaded BEFORE dendextend.

###OTHER NOTES:

  • dendextend 0.14.1 goes with Version 0.5.0 of dendextendRcpp. Previous versions of dendextendRcpp will not be effective for versions of dendextend which are before 0.14.0.
  • dendextend 0.14.1 is intended to be shipped to CRAN.

###UPDATED FUNCTIONS:

  • assign_dendextend_options - Moved to passing the functions through "dendextend_options" instead of through "options" (Thanks to suggestions by Kurt Hornik and Uwe Ligges).
  • assign_dendextend_options - is now exported.
  • remove_dendextend_options - now removes the object dendextend_options
  • get_branches_heights, heights_per_k.dendrogram, cut_lower_fun - now all rely on dendextend_options.

###UPDATED TESTS:

  • update tests to the new names in dendextendRcpp (dendextendRcpp_cut_lower_fun, dendextend_options)

###UPDATED FUNCTIONS:

  • assign_dendextend_options - Moved to passing the functions through "options" instead of through assignInNamespace (which was not intended for production use).
  • get_branches_heights, heights_per_k.dendrogram, cut_lower_fun - now all rely on the function located in the global options. This way, they can be replaced by the dendextendRcpp version, if available.

###UPDATED TESTS:

  • When comparing to dendextendRcpp - added condition to not make the check if the package is not loaded and in the search path (this way I could compare the tests with and without the dendextendRcpp package).
  • added a minor test for dendextend_get_branches_heights - checking the function directly through the options.

###UPDATED DOCS:

  • dendextend_get_branches_heights, dendextend_heights_per_k.dendrogram, dendextend_cut_lower_fun - gave speed tests

###NEW FUNCTIONS:

  • assign_dendextend_options - we now pass all functions that have a Rcpp equivalent through "options". While this adds a bit of an overhead (sadly), it still gets a much faster speed gain, and without various warnings that CRAN checks would not like...
  • dendextend_get_branches_heights, dendextend_heights_per_k.dendrogram, dendextend_cut_lower_fun

###OTHER NOTES:

  • dendextend 0.13.0 goes with Version 0.4.0 of dendextendRcpp. Previous versions of dendextendRcpp will not be effective for versions of dendextend which are before 0.13.0 (however, it would also not conflict with them...)
  • dendextend 0.13.0 is intended to be shipped to CRAN.

###UPDATED DESCRIPTION:

  • Removed VignetteBuilder: knitr (until later)
  • Removed Suggests: dendextendRcpp, (until later)
  • fixed mis-spelled words: extanding (14:40)

###NEW FUNCTIONS:

  • Hidden "stats" functions have been added to a new file "imports_stats.r" with a new local copy for 'stats:::.memberDend' 'stats:::.midDend' 'stats:::midcache.dendrogram' 'stats:::plotNode' 'stats:::plotNodeLimit' with stats::: -> stats_

###UPDATED FUNCTIONS:

  • stats:::cutree -> stats::cutree
  • dendextend:::cutree -> dendextend::cutree

###OTHER NOTES:

  • compacted ‘dendextend-tutorial.pdf’ from 961Kb to 737Kb (thanks to tools::compactPDF)
  • dendextend 0.12.2 is intended to be shipped to CRAN.

###UPDATED TESTS:

  • Made sure to check dendextendRcpp is available before calling it.

###UPDATED DOCS:

  • data(iris) -> data(iris, envir = environment())
  • Fix "\examples lines wider than 100 characters:" in several places.

###OTHER NOTES:

  • Commented out manipulations on the search path and of assignInNamespace (to avoid NOTES/warnings). This was done after moving all of these operations into Rcpp.
  • dendextend 0.12.1 is intended to be shipped to CRAN. (but failed)

###UPDATED FUNCTIONS:

  • exported prune_leaf
  • as.dendrogram.phylo as.phylo.dendrogram - turned into S3 (no longer exported)
  • changed functions names:
    • trim -> prune
    • unroot -> unbranch
  • Moved from ::: to :: (where possible).
  • tanglegram.dendrogram - fix warning in layout(matrix(1:3, nrow = 1), width = columns_width): partial argument match of 'width' to 'widths'
  • Return "as.phylo.dendrogram" by adding "ape" to "Imports" in DESCRIPTION and "import" to NAMESPACE. Also fixing consistency (using x instead of object).
  • unbranch.phylo - fix extra parameters.

###UPDATED DOCS:

  • leaf_Colors - fix example (added "dend").
  • Fix various "Missing link or links in documentation object" for example:
    • remove \link{untangle} from various .Rd (I never created this function...)
    • tangelgram -> tanglegram
  • fix "Unknown package 'dendroextra' in Rd xrefs" in color_branches docs. (into dendroextras)
  • Fix Undocumented code objects: 'old_cut_lower_fun' 'old_get_branches_heights' 'old_heights_per_k.dendrogram'. By adding them as "#' @aliases"
  • Fix "Codoc mismatches from documentation object"
    • 'rotate': - by removing k and h (since I never got to implement them...)
  • Fix "Mismatches in argument default values"
    • tanglegram
      • Name: 'margin_inner' Code: 3 Docs: 1.8
      • Name: 'lab.cex' Code: NULL Docs: 1
      • Name: 'remove_nodePar' Code: FALSE Docs: F
  • Fix "Argument names in code not in docs", for: edge.lwd dLeaf_left dLeaf_right main sub rank_branches hang match_order_by_labels cex_main cex_main_left cex_main_right cex_sub
  • Fix "'library' or 'require' call not declared from: 'ape'" by commenting-off every "require(ape)" command in the code, since it is already mentioned in imports! (see: http://stackoverflow.com/questions/15648772/how-do-i-prevent-r-library-or-require-calls-not-declared-warnings-when-dev) The problem still persists because of .onLoad in zzz.r, but we'll look into this later...
  • Fix "Undocumented arguments in documentation object" for:
    • 'bakers_gamma_for_2_k_matrix'
    • 'cor_bakers_gamma'
    • 'cut_lower_fun'
  • Fix "Objects in \usage without \alias in documentation object 'shuffle'": 'shuffle.dendrogram' 'shuffle.hclust' 'shuffle.phylo'
  • Fix "Argument items with no description in Rd object": 'plot_horiz.dendrogram', 'untangle_step_rotate_1side'.
  • rotate - Remove the "flip" command in the example (after I noticed that "rev" does this just fine...)

###UPDATED TESTS:

  • S3methods no longer seem to be exported (due to something in roxygen2), I chose to update the tests accordingly.
    • cutree.hclust -> dendextend:::cutree.hclust
    • cutree.dendrogram -> dendextend:::cutree.dendrogram
  • cut_lower_fun acts differently on dendextendRcpp vs old dendextend, so I updated the tests to reflect that.
  • Fixed the usage of person() in DESCRIPTION. (props go to Uwe Ligges for his input)

###OTHER NOTES:

  • Fixing .Rd indentation.
  • Fix S3method in NAMESPACE.
  • Added "ape::" to as.phylo.
  • Added to .Rbuildignore: (large files which are not essential)
    • inst/doc/2013-09-05_Boston-useR
    • vignettes/figure
    • vignettes/ (we'll deal with this later...)
  • Removed "Enhances: ape" from DESCRIPTION
  • README.md - Using 'talgalili/dendextend', in install_github

###UPDATED FUNCTIONS:

  • tanglegram now has "sub" and "cex_sub" parameters.
  • untangle_step_rotate_2side added k_seq parameter.
  • "trim" is now called "prune"!

###VIGNETTES:

  • Finished tanglegram and untangle.
  • Finished statistical measures of similarity between trees.

###UPDATED FUNCTIONS:

  • color_labels - added a "warn" parameter. And also set the default (in case no k/h is supplied) - to just color all of the labels.
  • Added "warn" parameter to: assign_values_to_leaves_nodePar, And set it to FALSE when used inside "tanglegram".
  • tanglegram now returns an invisible list with the two dendrograms (after they have been modified within the function).

###BUG FIXES:

  • untangle_random_search - made sure the function will return the original trees if no better tree was found.

###OTHER NOTES:

  • Separated 2013-09-05_Boston-useR.Rpres into two files (since RStudio is not able to handle them)

###VIGNETTES:

  • Added a knitr presentation for "Boston-useR" 2013-09-05. Includes an introduction to hclust and dendrogram objects, tree manipulation, and dendextend modules (still needs the dendextend section on tanglegram...)

###UPDATED FUNCTIONS:

  • tanglegram - added cex_main parameter.

###OTHER NOTES:

  • Gave proper credit to contributors in the DESCRIPTION file (and not just the .Rd files)

###NEW FUNCTIONS ADDED:

  • cut_lower_fun - it wraps the "cut" function, and is built to be masked by the function in dendextendRcpp in order to run 4 to 14 times faster.
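
For context, this is the base R "cut" call that such a wrapper builds on. The sketch below uses only stats functions; the cut height (half the root height) is an arbitrary choice, and the exact arguments and return value of cut_lower_fun are not shown here.

```r
dend <- as.dendrogram(hclust(dist(USArrests[1:10, ])))

# Cut at half the root height (arbitrary choice for illustration)
h <- attr(dend, "height") / 2

# cut() returns the pruned tree (`upper`) and the subtrees below h (`lower`)
lower <- cut(dend, h = h)$lower

# Every leaf ends up in exactly one of the lower subtrees
sizes <- vapply(lower, function(sub) attr(sub, "members"), numeric(1))
print(sizes)
```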

###NEW TESTS ADDED:

  • For Bk methods.

###OTHER NOTES:

  • The dendextendRcpp package (version 0.3.0) is now on github, and offers functions for making cutree.dendrogram(h) faster (between 4 to 14 times faster).

###VIGNETTES:

  • Added cut_lower_fun to the Rcpp section.
  • Added FM-index and Bk plot sections.

###NEW FUNCTIONS ADDED:

  • cor_bakers_gamma.hclust

###UPDATED FUNCTIONS:

  • cutree.hclust - added the "use_labels_not_values" parameter (ignored)

###UPDATED FUNCTIONS:

  • color_labels - added "labels" parameter for selective coloring of labels by name.
  • Bk_plot - now adds dots for the asymptotic lines in case of NA's
  • Bk - now calculates cutree once for all relevant k's - and only then goes forth with FM_index.

###BUG FIXES:

  • FM_index_R - now returns NA when comparing NA vectors (when, for example, there is no possible split for some k), instead of crashing (as it did before).
  • Bk_plot - now won't turn one dendrogram into hclust, while leaving the other a dendrogram.

###OTHER NOTES:

  • The dendextendRcpp package (version 0.2.0) is now on github, and offers functions for making cutree.dendrogram(k) MUCH faster (between 20 to 100 times faster). (this is besides having labels.dendrogram now also accept a leaf as a tree.)

###VIGNETTES:

  • Added Rcpp section.
  • Started the Bk section (some theory, but no code yet - although it is all written by now...).

###NEW FUNCTIONS ADDED:

  • sort_2_clusters_vectors
  • FM_index_profdpm
  • FM_index_R
  • FM_index
  • FM_index_permutation - for checking permutation distribution of the FM Index
  • Bk
  • Bk_permutations
  • Bk_plot (it can be MUCH slower for dendrograms with large trees, but works great for hclust)
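
As background for the FM_index and Bk functions listed above, here is a base R sketch of the Fowlkes-Mallows index between two clusterings, computed from their contingency table. The helper name fm_index_sketch is hypothetical and this is not the package's code; it assumes at least one cluster in each partition has more than one item, otherwise the denominator is 0.

```r
# Hypothetical helper: Fowlkes-Mallows index of two cluster label vectors
fm_index_sketch <- function(a, b) {
  stopifnot(length(a) == length(b))
  n <- length(a)
  m <- table(a, b)                # contingency table of the two clusterings
  tk <- sum(m^2) - n              # pairs clustered together in both
  pk <- sum(rowSums(m)^2) - n     # pairs clustered together in a
  qk <- sum(colSums(m)^2) - n     # pairs clustered together in b
  tk / sqrt(pk * qk)
}

x <- USArrests[1:20, ]
hc_complete <- hclust(dist(x), method = "complete")
hc_average <- hclust(dist(x), method = "average")

# Compare the two trees when each is cut into k = 3 clusters
fm_k3 <- fm_index_sketch(cutree(hc_complete, k = 3), cutree(hc_average, k = 3))
print(fm_k3)

# The same partition with swapped cluster numbers still gives 1
fm_same <- fm_index_sketch(c(1, 1, 2, 2), c(2, 2, 1, 1))
```

Repeating the computation over a range of k values gives the kind of profile that a Bk plot displays.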

###UPDATED FUNCTIONS:

  • color_labels - removed unused 'groupLabels' parameter.

###VIGNETTES:

  • Added the FM Index section.

###FILE CHANGES:

  • Bk-method.r file added.

###OTHER NOTES:

  • The dendextendRcpp package (version 0.1.1) is now on github, and offers a faster labels.dendrogram function (It is 20 to 40 times faster than the 'stats' function!)
  • Added a commented-out section which could (in the future) be the basis of an Rcpp cutree (actually cutree_1h.dendrogram) function!

###NEW FUNCTIONS ADDED:

  • cor_bakers_gamma

  • rank_order.dendrogram - for fixing leaves value order.

  • duplicate_leaf - for sample.dendrogram

  • sample.dendrogram - for bootstrapping trees when the original data table is missing.

  • sort_dist_mat

  • cor_cophenetic
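
The idea behind a cophenetic correlation between two trees can be sketched with base R alone: compute each tree's cophenetic distances and correlate them. This is an illustration of the concept, not the package's cor_cophenetic code; the data subset and linkage methods are arbitrary choices.

```r
x <- USArrests[1:20, ]
hc_complete <- hclust(dist(x), method = "complete")
hc_average <- hclust(dist(x), method = "average")

# Cophenetic distance: the height at which two items are first merged.
# For hclust objects built on the same data, both results share one item order.
coph_complete <- cophenetic(hc_complete)
coph_average <- cophenetic(hc_average)

# Pearson correlation between the two sets of cophenetic distances
coph_cor <- cor(coph_complete, coph_average)
print(coph_cor)
```

A value near 1 means the two trees merge pairs of items at similarly ranked heights.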

###UPDATED FUNCTIONS:

  • tanglegram - added the match_order_by_labels parameter.

###VIGNETTES:

  • Added the Baker's Gamma Index section.
  • Added bootstrap and permutation examples for inference on Baker's Gamma.
  • Also for Cophenetic correlation.

###FILE CHANGES:

  • sample.dendrogram.r file added.

###BUG FIXES:

  • fix_members_attr.dendrogram - fixed a bug introduced by the new "members" method in nleaves. (test added)

###NEW FUNCTIONS ADDED:

  • get_childrens_heights - Get height attributes from a dendrogram's children
  • rank_branches - ranks the heights of branches - making comparison of the topologies of two trees easier.

###UPDATED FUNCTIONS:

  • sort_levels_values - now returns a vector with NAs as is, without changing it. Also, a warning is issued (with a parameter called 'warn' to suppress the warning).
  • cutree - now suppresses warnings produced by sort_levels_values, in the case of NA values.
  • plotNode_horiz - now uses "Recall" (I might implement this in more functions).
  • tanglegram - added parameters hang and rank_branches.

###BUG FIXES:

  • tanglegram - fixed the right tree's labels position relative to the leaves tips. (they were too far away because of a combination of text_adj with dLeaf)

###VIGNETTES:

  • Fixed the dLeaf in tanglegram plots, and gave an example of using rank_branches.

###NEW FUNCTIONS ADDED:

  • plotNode_horiz - allows the labels, in plot_horiz.dendrogram, to be aligned to the leaves tips when the tree is plotted horizontally, its leaves facing left.

###UPDATED FUNCTIONS:

  • plot_horiz.dendrogram - allows the labels to be aligned to the leaves tips when the tree is plotted horizontally, its leaves facing left. (took a lot of digging into internal functions used by plot.dendrogram)
  • tanglegram - added the parameters dLeaf_left and dLeaf_right. Also, labels are now aligned to the leaves tips in the right dendrogram.

###BUG FIXES:

  • Fix untangle_step_rotate_1side to work with non-missing dend_heights_per_k
  • Set sort_cluster_numbers = TRUE for cutree, in order to make it compatible with stats::cutree. Added a test for this.
  • Fix cutree.hclust to work with a vector of k when !order_clusters_as_data
  • Fix cutree.dendrogram to give default results as stats::hclust does, by setting the default to sort_cluster_numbers = TRUE.
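
For context on the compatibility target mentioned above, this small base R sketch shows how stats::cutree numbers its clusters: ids are assigned in the order in which clusters first appear in the data, so the first observation always belongs to cluster 1. The example only uses stats functions. How dendextend reproduces this through sort_cluster_numbers is not shown here.

```r
hc <- hclust(dist(USArrests[1:10, ]))
cl <- cutree(hc, k = 3)          # stats::cutree on an hclust object

# Cluster ids in order of first appearance along the data rows
first_seen <- unique(unname(cl))
```

Here first_seen is 1, 2, 3, which is the numbering a compatible cutree method needs to match.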

###OTHER NOTES:

  • Variations of the changes to plot_horiz.dendrogram and plotNode_horiz should be added to R core in order to allow forward compatibility.

###NEW FUNCTIONS ADDED:

  • untangle_step_rotate_2side

###VIGNETTES NEW SECTIONS ADDED:

  • untangle_forward_rotate_2side

###NEW FUNCTIONS ADDED:

  • shuffle - Random rotation of trees
  • untangle_random_search - random search for two trees with low entanglement.
  • flip_leaves
  • all_couple_rotations_at_k
  • untangle_forward_rotate_1side

###OTHER NOTES:

  • rotate - minor code improvements.

###VIGNETTES NEW SECTIONS ADDED:

  • untangle_random_search
  • untangle_forward_rotate_1side

###NEW FUNCTIONS ADDED:

  • tanglegram - major addition!

  • plot_horiz.dendrogram - Plotting a left-tip-adjusted horizontal dendrogram

  • remove_leaves_nodePar

  • assign_values_to_branches_edgePar

  • remove_branches_edgePar

  • match_order_by_labels

  • match_order_dendrogram_by_old_order - like match_order_by_labels, but faster

  • entanglement

###UPDATED FUNCTIONS:

  • assign_values_to_leaves_nodePar - now makes sure pch==NA if we are modifying a nodePar value which is other than pch (and pch did not exist before).
  • nleaves - now allows the use of the "members" attr of a dendrogram for telling us the size of the tree.
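
The "members" attribute mentioned for nleaves is part of base R dendrogram objects. A minimal sketch, using only stats functions and illustrative variable names, of how it relates to the number of leaves:

```r
dend <- as.dendrogram(hclust(dist(USArrests[1:5, ])))

n_members <- attr(dend, "members")   # size stored on the root node
n_labels  <- length(labels(dend))    # size obtained by collecting every leaf label
```

Reading the attribute avoids walking the whole tree, which is presumably why it is offered as an option.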

###OTHER NOTES:

  • entanglement.r file added
  • untangle.r file added

###VIGNETTES NEW SECTIONS ADDED:

  • Tanglegram
  • Entanglement

###NEW FUNCTIONS ADDED:

  • tanglegram

###UPDATED FUNCTIONS:

  • rotate - fixes calling the same functions more than once (minor improvements)
  • fac2num - keep_names parameter added
  • intersect_trees - added the "warn" parameter.

###NEW TESTS:

  • order.dendrogram gives warning and can be changed
  • fac2num works

###NEW FUNCTIONS ADDED: (including tests and documentation)

  • is.natural.number

  • cutree_1h.dendrogram - like cutree, but only for 1 height value.

  • fix_members_attr.dendrogram - just to validate that prune works o.k.

  • hang.dendrogram - hangs a dendrogram's leaves (also allows for a rotated hanging dendrogram), works also for non-binary trees.

  • nnodes - count the number of nodes in a tree

  • as.dendrogram.phylo - based on as.hclust.

  • get_nodes_attr - allows easy access to attributes of branches and leaves

  • get_branches_heights

  • heights_per_k.dendrogram - gets the heights at which cutting the tree yields each number of clusters k.

  • is.hclust

  • is.dendrogram

  • is.phylo

  • fac2num

  • as.phylo.dendrogram - based on as.hclust.

  • cutree_1k.dendrogram - like cutree, but only for 1 k (number of clusters) value.

  • cutree.dendrogram - like cutree but for dendrograms (and it is also vectorized)

  • cutree.hclust - like cutree but for hclust

  • cutree.phylo - like cutree but for phylo

  • sort_levels_values - orders the clusters resulting from cutree from left to right

  • cutree - with S3 methods for dendrogram/hclust/phylo

  • color_branches - colors a tree's branches based on its clusters. This is a modified version of the color_clusters function from Gregory Jefferis's dendroextras package. It extends that function by using my own version of cutree.dendrogram, which allows it to work for trees that hclust cannot handle (unrooted and non-ultrametric trees). It also allows REPEATED cluster color assignments to branches of the same tree, something the original function was not able to handle, and it handles extreme cases, such as when the labels of a tree are integers.

  • color_labels - just like color_branches, but for labels.

  • assign_values_to_leaves_nodePar - allows for complex manipulation of dendrogram's leaves parameters.

###UPDATED FUNCTIONS:

  • nleaves - added nleaves.phylo methods, based on as.hclust so it could be improved in the future.
  • "labels_colors<-" - fixed it so that by default it would not add pch=1 to the leaves.
  • "order.dendrogram<-" - now returns an integer (instead of numeric)
  • cutree (cutree.dendrogram / cutree.hclust) - Prevent R from crashing when using cutree on a subset tree (e.g: dend[[1]])
  • Renamed the unroot function to unbranch
  • get_leaves_attr - added a simplify parameter.

###OTHER NOTES:

  • Updated the exact way the GPL was stated in DESCRIPTION and gave a better reference within each file.

###VIGNETTES NEW SECTIONS ADDED:

  • Hanging trees
  • Coloring branches.

###NEW FUNCTIONS ADDED:

  • removed "flip", added rev.hclust instead (since rev.dendrogram already exists)

###VIGNETTES NEW SECTIONS ADDED:

  • Vignettes created (using LaTeX)
  • Basic introduction to dendrogram objects
  • Labels extraction and assignment, and measuring tree size.
  • Tree manipulation: unrooting, pruning, label coloring, rotation

###NEW TESTS ADDED:

  • labels extraction, assignment and tree size (especially important for comparing hclust and dendrogram!)
  • Tree manipulation: unrooting, pruning, label coloring, rotation

###UPDATED FUNCTIONS:

  • "labels.hclust" - added the "order" parameter. (based on some ideas from Gregory Jefferis's dendroextras package)
  • "labels.hclust" and "labels.hclust<-" - now both use order=TRUE as default. This makes them consistent with labels.dendrogram. Proper tests have been implemented.
  • "labels<-.dendrogram" - now makes sure the nodes of the new dendrogram are not each of class "dendrogram" (which happens when using dendrapply)
  • unclass_dend - now uses dendrapply
  • get_branches_attr - added "warning" parameter
  • unroot.dendrogram - Can now deal with unrooting more than 3 branches. Suppresses various warnings.
  • as_hclust_fixed - now works just as as.hclust when hc is missing.
  • rotate - allowed "order" to accept character vector.
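
The difference that the "order" parameter of labels.hclust addresses can be seen with base R alone. An hclust object stores its labels in data order, while labels() on a dendrogram returns them in tree (plotting) order. The sketch below uses only stats functions, with illustrative variable names.

```r
hc   <- hclust(dist(USArrests[1:6, ]))
dend <- as.dendrogram(hc)

labels_data_order <- hc$labels             # order of the rows in the data
labels_tree_order <- hc$labels[hc$order]   # order of the leaves in the tree

# The tree-ordered labels are what labels() gives for the dendrogram
same_as_dend <- identical(labels_tree_order, labels(dend))
```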

###OTHER NOTES:

  • Extending the documentation for: rotate, labels.hclust,
  • Added a welcome message when loading the package (zzz.r file added)
  • Added a first template for browseVignettes(package ='dendextend')
  • Added a tests folder - making the foundation for using testthat.
    • Added tests for labels assignment
  • Added a clear GPL-2+ copyright notice on each r file.
  • Forcing {ape} to load before {dendextend}, thus allowing for both rotate and unroot to work for BOTH packages. It does add extra noise when loading the package, but it is the best solution I can think of at this point.

###NEW FUNCTIONS ADDED:

  • count_terminal_nodes
  • labels_colors (retrieving and assignment)
  • unclass_dend
  • head.dendrogram (S3 method for dendrogram)
  • nleaves (with S3 methods for dendrogram and hclust)
  • rotate (with S3 methods for dendrogram, hclust and phylo)
  • sort (with S3 methods for dendrogram and hclust)
  • flip (works for both dendrogram and hclust)
  • prune - prunes leaves off a dendrogram/hclust/phylo trees. (based on the prune_leaf function)
  • as_hclust_fixed
  • get_branches_attr
  • unroot (dendrogram/hclust/phylo)
  • raise.dendrogram
  • flatten.dendrogram
  • order.dendrogram<-
  • intersect_trees

###UPDATED FUNCTIONS:

  • "labels<-.dendrogram" - made sure to allow shorter length of labels than the size of the tree (now uses recycling). This version is now sure to deal correctly with labeling trees with duplicate labels.

###OTHER NOTES:

  • From here on I will be using "." only for S3 method functions. Other functions will use "_"
  • Added more .r files, and changed the locations of some functions.

###NEW FUNCTIONS ADDED:

  • S3 methods for label assignment operator for vector, dendrogram, hclust, matrix.

###OTHER NOTES:

  • Includes skeletons for some functions that will be added in the future.

install.packages("dendextend")

1.5.2 by Tal Galili, 7 months ago


https://cran.r-project.org/package=dendextend, https://github.com/talgalili/dendextend/, https://www.r-statistics.com/tag/dendextend/, https://bioinformatics.oxfordjournals.org/content/31/22/3718


Report a bug at https://github.com/talgalili/dendextend/issues


Browse source code at https://github.com/cran/dendextend


Authors: Tal Galili [aut, cre, cph] (https://www.r-statistics.com), Gavin Simpson [ctb], Gregory Jefferis [aut, ctb] (imported code from his dendroextras package), Marco Gallotta [ctb] (a.k.a: marcog), Johan Renaudie [ctb] (https://github.com/plannapus), The R Core Team [ctb] (Thanks for the Infrastructure, and code in the examples), Kurt Hornik [ctb], Uwe Ligges [ctb], Andrej-Nikolai Spiess [ctb], Steve Horvath [ctb], Peter Langfelder [ctb], skullkey [ctb], Mark Van Der Loo [ctb] (https://github.com/markvanderloo d3dendrogram), Andrie de Vries [ctb] (ggdendro author), Zuguang Gu [ctb] (circlize author), Cath [ctb] (https://github.com/CathG), Yoav Benjamini [ths]



Task views: Cluster Analysis & Finite Mixture Models, Phylogenetics, Especially Comparative Methods


GPL-2 | GPL-3 license


Imports utils, stats, datasets, magrittr, ggplot2, fpc, whisker, viridis

Suggests knitr, rmarkdown, testthat, seriation, colorspace, plyr, ape, profdpm, microbenchmark, gplots, NMF, heatmaply, d3heatmap, dynamicTreeCut, pvclust, corrplot, DendSer, MASS, cluster, circlize, covr

Enhances ggdendro, labeltodendro, dendroextras, Hmisc, data.table, rpart


Imported by BiBitR, CINNA, MiRAnorm, d3heatmap, factoextra, heatmaply, neatmaps, seriation.

Depended on by EnsCat.

Suggested by CluMix, Rnightlights, circlize, dbscan, pergola, plotly.

