An interface to the 'Open Tree of Life' API to retrieve phylogenetic trees, information about studies used to assemble the synthetic tree, and utilities to match taxonomic names to 'Open Tree identifiers'. The 'Open Tree of Life' aims at assembling a comprehensive phylogenetic tree for all named species.
rotl is an R package to interact with the Open Tree of Life data APIs. It was initially developed as part of the NESCENT/OpenTree/Arbor hackathon.
Client libraries to interact with the Open Tree of Life API also exist for Python and Ruby.
The current stable version is available from CRAN, and can be installed by typing the following at the prompt in R:
install.packages("rotl")
If you want to test the development version, you first need to install the remotes package.
install.packages("remotes")
Then you can install rotl using:
remotes::install_github("ropensci/rotl")
There are three vignettes:
After installing the package, start with the "How to use rotl?" vignette by typing:
vignette("how-to-use-rotl", package="rotl")
Then explore how you can use rotl with other packages to combine your data with trees from the Open Tree of Life project by typing:
vignette("data_mashups", package="rotl")
The vignette "Using the Open Tree Synthesis in a comparative analysis" demonstrates how you can reproduce an analysis from a published paper by downloading the tree its authors used, along with data from the supplementary material:
vignette("meta-analysis", package="rotl")
The vignettes are also available from CRAN: How to use rotl?, Data mashups, and Using the Open Tree synthesis in a comparative analysis.
Taxonomic names are represented in the Open Tree by numeric identifiers, the ott_ids (Open Tree Taxonomy identifiers). To extract a portion of a tree from the Open Tree, you first need to find ott_ids for a set of names using the tnrs_match_names function:
library(rotl)
apes <- c("Pongo", "Pan", "Gorilla", "Hoolock", "Homo")
(resolved_names <- tnrs_match_names(apes))
##   search_string unique_name approximate_match ott_id is_synonym flags
## 1         pongo       Pongo             FALSE 417949      FALSE
## 2           pan         Pan             FALSE 417957      FALSE
## 3       gorilla     Gorilla             FALSE 417969      FALSE
## 4       hoolock     Hoolock             FALSE 712902      FALSE
## 5          homo        Homo             FALSE 770309      FALSE
##   number_matches
## 1              2
## 2              2
## 3              1
## 4              1
## 5              1
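The number_matches column shows that two of the five names matched more than one taxon. As a minimal sketch of how such rows could be flagged, the snippet below uses a hand-built copy of part of the table above so that it runs without querying the API. The object name `resolved` is made up for illustration; in a live session you would work on resolved_names directly.

```r
# Hand-built copy of three columns from the output above
# (illustrative only, not fetched from the Open Tree API).
resolved <- data.frame(
  search_string  = c("pongo", "pan", "gorilla", "hoolock", "homo"),
  ott_id         = c(417949, 417957, 417969, 712902, 770309),
  number_matches = c(2, 2, 1, 1, 1),
  stringsAsFactors = FALSE
)

# Names with more than one candidate match deserve a closer look
# before their ott_ids are used to request a tree.
ambiguous <- resolved$search_string[resolved$number_matches > 1]
print(ambiguous)
```

For names flagged this way, the inspect() method for objects returned by tnrs_match_names() is the natural next step for reviewing the alternative matches.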
Now we can get the tree with just those tips:
tr <- tol_induced_subtree(ott_ids=ott_id(resolved_names))
plot(tr)
The code above can be summarized in a single pipe:
library(magrittr)
## or expressed as a pipe:
c("Pongo", "Pan", "Gorilla", "Hoolock", "Homo") %>% tnrs_match_names %>% ott_id %>% tol_induced_subtree %>% plot
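The tip labels of a tree returned by tol_induced_subtree carry OTT id information, which gets in the way when matching tips to names in other data sources. The package provides strip_ott_ids for this. The sketch below only illustrates the idea in base R so that it runs offline. The helper `drop_ott_suffix` is hypothetical, and the label format (name, then `_ott`, then the numeric id) is an assumption rather than something stated in this document.

```r
# Assumed label format: taxon name, "_ott", numeric id (an assumption).
labels <- c("Pongo_ott417949", "Pan_ott417957", "Homo_ott770309")

# Hypothetical base-R stand-in for what strip_ott_ids() is described as doing.
# It is not the package's implementation.
drop_ott_suffix <- function(x, remove_underscores = FALSE) {
  out <- sub("_ott[0-9]+$", "", x)            # drop the trailing id
  if (remove_underscores) {
    out <- gsub("_", " ", out, fixed = TRUE)  # optional: underscores to spaces
  }
  out
}

clean <- drop_ott_suffix(labels)
print(clean)
```

In a live session, applying the package's own strip_ott_ids to tr$tip.label would be the preferred route, since it follows whatever label format the API actually returns.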
To cite rotl in publications please use:
Michonneau, F., Brown, J. W. and Winter, D. J. (2016). rotl: an R package to interact with the Open Tree of Life data. Methods in Ecology and Evolution. 7(12):1476-1481. doi: 10.1111/2041-210X.12593
You may also want to cite the paper for the Open Tree of Life:
Hinchliff, C. E., et al. (2015). Synthesis of phylogeny and taxonomy into a comprehensive tree of life. Proceedings of the National Academy of Sciences. 112(41):12764-12769. doi: 10.1073/pnas.1423041112
The manuscript in Methods in Ecology and Evolution includes additional examples on how to use the package. The manuscript and the code it contains are also hosted on GitHub at: https://github.com/fmichonneau/rotl-ms
Starting with v3.0.0 of the package, the major and minor version numbers (the first 2 digits of the version number) will be matched to those of the API. The patch number (the 3rd digit of the version number) will be used to reflect bug fixes and other changes that are independent from changes to the API.
rotl can be used to access other versions of the API (if they are available), but the high-level functions will most likely not work. Instead, you will need to parse the output yourself using the "raw" returns from the unexported low-level functions (all prefixed with a .). For instance, to use the tnrs/match_names endpoint for v2 of the API:
rotl:::.tnrs_match_names(c("pan", "pongo", "gorilla", "hoolock", "homo"), otl_v="v2")
Please note that this project is released with a Contributor Code of Conduct. By participating in this project you agree to abide by its terms.
tnrs_match_names are consistent, and remain the same even after using update().
get_study_subtree gains the argument tip_label to control the formatting of the tip labels, #90, reported by @bomeara
is_in_tree takes a list of OTT ids (i.e., the output of ott_id()), and returns a logical vector indicating whether they are included in the synthetic tree (workaround #31).
get_study_subtree ignored the argument subtree_id, #89 reported by @bomeara
citation("rotl") now includes the reference to the Open Tree of Life publication.
Fix tests and vignette to reflect changes accompanying release 6.1 of the synthetic tree
Add section in vignette "How to use rotl?" about how to get the higher taxonomy from a given taxon.
Add CITATION file with MEE manuscript information (#82)
rotl now interacts with v3.0 of the Open Tree of Life APIs. The documentation has been updated to reflect the associated changes. More information about v3.0 of the Open Tree of Life APIs can be found on their wiki.
New methods: tax_sources, is_suppressed, tax_rank, unique_name, name, ott_id, for objects returned by tnrs_match_names(), taxonomy_taxon_info(), taxonomy_taxon_mrca(), tol_node_info(), tol_about(), and tol_mrca(). Each of these methods has its own class.
New method tax_lineage() to extract the higher taxonomy from an object returned by taxonomy_taxon_info() (initially suggested by Matt Pennell, #57).
New method tol_lineage() to extract the nodes towards the root of the tree.
New print methods for tol_node_info() and tol_mrca().
New functions study_external_IDs() and taxon_external_IDs() that return the external identifiers for a study and associated trees (e.g., DOI, TreeBase ID); and the identifiers of taxon names in taxonomic databases. The vignette "Data mashups" includes an example of how to use them.
The function strip_ott_id() gains the argument remove_underscores to remove underscores from tips in trees returned by OTL.
Rename method ott_taxon_name() to tax_name() for consistency.
Rename methods synth_sources() and study_list() to source_list().
Refactor how result of query is checked and parsed (invisible to the user).
Fix bug in studies_find_studies(): the arguments verbose and exact were ignored.
The argument only_current has been dropped for the methods associated with objects returned by tnrs_match_names()
The print method for tnrs_context() duplicated some names.
inspect(), update() and synonyms() methods for tnrs_match_names() did not work if the query included unmatched taxa.
New vignette: meta-analysis
Added arguments include_lineage and list_terminal_descendants to taxonomy_taxon()
Improve warning and format of the result if one of the taxa requested doesn't match anything in tnrs_match_names.
In the data frame returned by tnrs_match_names, the columns approximate_match, is_synonym and is_deprecated are now logical (instead of character) [issue #54]
New utility function strip_ott_ids removes OTT id information from a character vector, making it easier to match tip labels in trees returned by tol_induced_subtree to taxonomic names in other data sources. This function can also remove underscores from the taxon names.
New method list_trees returns a list of tree ids associated with studies. The function takes the output of studies_find_studies or studies_find_trees.
studies_find_studies and studies_find_trees gain argument detailed (default set to TRUE), which produces a data frame summarizing information (title of the study, year of publication, DOI, ids of associated trees, ...) about the studies matching the search criteria.
get_study_tree gains argument deduplicate. When TRUE, if the tree returned for a given study contains duplicated tip labels, they will be made unique before being parsed by NCL by appending a suffix (_1, _2, _3, etc.). (#46, reported by @bomeara)
New method get_study_year for objects of class study_meta that returns the year of publication of the study.
A more robust approach is used by get_tree_ids to identify the tree ids in the metadata returned by the API
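The deduplicate entry for get_study_tree above describes making duplicated tip labels unique by appending _1, _2, _3, and so on. The following base-R sketch illustrates that kind of suffixing only. The helper `dedup_labels` is hypothetical, and whether the package also suffixes the first occurrence of a duplicated label is an assumption here.

```r
# Hypothetical helper: append _1, _2, ... to labels that occur more than once.
# Labels that are already unique are returned unchanged.
dedup_labels <- function(x) {
  n_total <- ave(seq_along(x), x, FUN = length)    # occurrences of each label
  idx     <- ave(seq_along(x), x, FUN = seq_along) # running count per label
  ifelse(n_total > 1, paste0(x, "_", idx), x)
}

tips <- c("Pan", "Homo", "Pan", "Gorilla", "Pan")
unique_tips <- dedup_labels(tips)
print(unique_tips)
```

Suffixing every copy, rather than only the later ones, keeps it visible in the resulting tree which tips came from a duplicated label.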