Measurement and partitioning of diversity, based on Tsallis entropy, following Marcon and Herault (2015)

entropart is an R package that provides functions to calculate alpha, beta and gamma diversity of communities, including phylogenetic and functional diversity.

Estimation-bias corrections are available.

In the entropart package, individuals of different *species* are counted in several *communities*, which may or may not
be aggregated to define a *metacommunity*.
In the metacommunity, the probability of finding a species is the weighted average of its probabilities in the communities.
This is a naming convention, which may correspond to plots in a forest inventory or any data organized the same way.
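
The weighted average can be sketched in base R, without calling the package. The abundance matrix, the plot names and the choice of weights proportional to community sizes are assumptions made for illustration only; other weights are possible.

```r
# Hypothetical abundances: rows are species, columns are communities.
abundances <- matrix(
  c(10,  0,
     5,  5,
     5, 15),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("sp1", "sp2", "sp3"), c("plot1", "plot2"))
)

# Species probabilities within each community (each column sums to 1).
community_probs <- sweep(abundances, 2, colSums(abundances), "/")

# Community weights: proportional to the number of individuals (an assumption).
weights <- colSums(abundances) / sum(abundances)

# Metacommunity probabilities: weighted average of the community probabilities.
meta_probs <- drop(community_probs %*% weights)
meta_probs
```

Whatever the weights, they should sum to 1 so that the metacommunity probabilities also sum to 1.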

Basic functions allow computing diversity of a community.
Data is simply a vector of probabilities (summing up to 1) or of abundances (integer values that are numbers of individuals).
Calculate entropy with functions such as *Tsallis*, *Shannon*, *Simpson*, *Hurlbert* or *GenSimpson*
and explicit diversity (i.e. effective number of species) with *Diversity* and others.
By default, the best available estimator of diversity will be used, according to the data.
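
As a rough illustration of what these quantities are, the following base R sketch computes Tsallis entropy and the corresponding effective number of species from their textbook formulas. The helper names `tsallis_entropy` and `hill_diversity` are hypothetical and are not part of the package; no estimation-bias correction is applied, so the values may differ from those returned by the package functions on abundance data.

```r
# Tsallis entropy of order q for a probability vector (no bias correction).
tsallis_entropy <- function(p, q) {
  p <- p[p > 0]                          # species with zero probability contribute nothing
  if (q == 1) return(-sum(p * log(p)))   # the limit q -> 1 is Shannon entropy
  (1 - sum(p^q)) / (q - 1)
}

# Effective number of species (Hill number) of order q.
hill_diversity <- function(p, q) {
  p <- p[p > 0]
  if (q == 1) return(exp(-sum(p * log(p))))
  sum(p^q)^(1 / (1 - q))
}

abundances <- c(40, 30, 20, 10)          # hypothetical counts of individuals
p <- abundances / sum(abundances)

tsallis_entropy(p, q = 2)   # Simpson entropy, 1 - sum(p^2)
hill_diversity(p, q = 0)    # richness: number of species with p > 0
hill_diversity(p, q = 2)    # inverse of Simpson's concentration
```

Orders 0, 1 and 2 correspond to richness, Shannon and Simpson, which is why a single parameter `q` covers the usual indices.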

Communities can be simulated by *rCommunity*, explicitly declared as a species distribution (*as.AbdVector* or *as.ProbaVector*),
and plotted.

Phylogenetic entropy and diversity can be calculated if a phylogenetic (or functional), ultrametric tree is provided.
See *PhyloEntropy*, *Rao* for examples of entropy and *PhyloDiversity* to calculate phylodiversity,
with the state-of-the-art estimation-bias correction.
Similarity-based diversity is calculated with *Dqz*, based on a similarity matrix.
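
To make the role of the similarity matrix concrete, here is a minimal base R sketch of similarity-based diversity, assuming the usual definition in which each species is weighted by its average similarity to the rest of the community. The function name `similarity_diversity` and the matrix `Z` below are hypothetical and serve only as an illustration; this is not the package's implementation and no bias correction is applied.

```r
# Similarity-based diversity of order q. Z is a species similarity matrix
# with values in [0, 1] and 1 on the diagonal.
similarity_diversity <- function(p, q, Z) {
  keep <- p > 0
  p <- p[keep]
  Zp <- drop(Z[keep, keep, drop = FALSE] %*% p)   # ordinariness of each species
  if (q == 1) return(exp(-sum(p * log(Zp))))
  sum(p * Zp^(q - 1))^(1 / (1 - q))
}

p <- c(0.5, 0.3, 0.2)

# Identity matrix: species are totally dissimilar, so the result is the
# ordinary effective number of species.
similarity_diversity(p, q = 2, Z = diag(3))

# Hypothetical similarity matrix: species 1 and 2 are close to each other.
Z <- matrix(c(1.0, 0.8, 0.0,
              0.8, 1.0, 0.0,
              0.0, 0.0, 1.0), nrow = 3, byrow = TRUE)
similarity_diversity(p, q = 2, Z = Z)
```

Similar species count for less than distinct ones, so the second value is lower than the first.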

A quick introduction is in `vignette("entropart")`.

Full documentation is available online, in the "Articles" section of the web site of the vignette. It is a continuous update of the paper published in the Journal of Statistical Software (Marcon & Hérault, 2015).

The development version documentation is also available.

Marcon, E. and Herault, B. (2015). entropart: An R Package to Measure and Partition Diversity.
*Journal of Statistical Software*. 67(8): 1-26.

- Estimation of diversity at a chosen level (sample size or coverage).
- `DivAccum()` function.
- Entropy accumulation functions.
- ggplot2 supported.
- `autoplot()` methods added for entropart objects.
- The "Best" estimator of diversity is now "UnveilJ" and the default estimator of richness is "Jackknife".
- The "ChaoWangJost" estimator is renamed "ChaoJost".

- Unit tests added.
- Vignette by pkgdown.

- The jackknife estimator of richness returned an error for communities where all species had the same abundance.
- `Richness` returned 0 instead of 1 for a community with a single species.

- On Travis now.
- Reduced package size.
- The rule to calculate the number of individuals of MetaCommunities has been changed to improve gamma diversity bias correction. See the user manual vignette.
- Generic function arguments cleaned up.

- Very large metacommunities returned an integer overflow error.

- `HqzBeta()` returned erroneous values if a species probability was equal to zero.

- On GitHub now.
- Documentation updated: phylogenetic dendrograms can be of class `phylo`, `phylog`, `hclust` or `PPtree` whatever the function.
- The introduction vignette is HTML now.
- A new vignette is dedicated to phylogenies.

- Argument checking (`CheckArguments = TRUE`) is not possible when the package is not loaded and a function is called by `entropart::function()`. An error was returned. It is replaced by a warning.

- Explicit export of all non-internal functions instead of `exportPattern("^[[:alpha:]]+")`.
- Updated references to published articles.
- Updated `help("entropart")`.
- New introduction vignette.
- Vignettes compiled with *knitr* instead of *Sweave*.

- LazyData is used to save memory.
- Better reporting of the argument names in embedded calls of functions.

- The simulation of log-series communities was incorrect.

- Generalized Simpson's entropy and diversity added (`GenSimpson` and `GenSimpsonD`).
- `Originality.Species()` is deprecated because it is pointless. `ade4::originality()` allows calculating it for q=2. Leinster (2009) and Leinster and Meckes (2015) showed that `Originality.Species()` does not depend on the order of diversity.

- ZhangGrabchak estimator of entropy is now calculated by the C code of `EntropyEstimation::Tsallis.z`/`Entropy.z` rather than the R code of `bcTsallis()`. This is much faster when the number of individuals is high. Applies to the `ChaoWangJost` (Best) estimator too.

- `DivProfile()` now allows computing bootstrap confidence intervals.

- The entropy estimation (of order different from 1) of a distribution with no singleton returned `NA` with `ChaoWangJost` correction. Reported by Zach Marion. Only partly corrected in Version 1.4-1. Corrected.
- `DivEst` returned incorrect beta diversity if q was not 1. Corrected.

- All scalar values of diversity or entropy are now named. Their name is the bias correction used to obtain them.
- The `Unveiled` estimator is more versatile. `Correction = "Unveil"` is deprecated and replaced by `UnveilC`, `UnveiliC` or `UnveilJ` in functions such as `Tsallis()` or `Diversity()`.

- Parallelization of `DivProfile()`, `CommunityProfile()` and `PhyloApply()` using *mclapply* from the parallel package. No effect on Windows, much faster on other systems.
- Extensive use of `vapply()` instead of `sapply()` makes some functions faster.
- `AllenH()` and `ChaoPD()` returned `NA` if the tree contained more species than the probability vector. Now, the tree may be pruned or kept unchanged and extra species considered to have probabilities 0.

- Using `phylog` trees in `AllenH()` and `ChaoPD()` returned erroneous unnormalized diversity (divided by two) because the conversion of `phylog` to `htree` divides branch lengths by two. Corrected.
- The richness estimator `iChao1` returned `NA` if the distribution contained singletons but no doubletons. Corrected.

- `phylog` objects (deprecated in *ade4*) are replaced by `phylo` trees from package *ape* in the definition of the `PPtree` class. Issues caused by `phylog` such as replacing `.` and `-` by `_` in species names do not occur any longer. `phylog` trees are still accepted for compatibility.
- `ChaoPD()` and `AllenH()` now accept `phylo` trees.
- `Richness` now returns a named value. The name contains the estimator used.
- Updated *CITATION*: the paper about this package has been published: Eric Marcon, Bruno Herault (2015). entropart: An R Package to Measure and Partition Diversity. *Journal of Statistical Software*, 67(8), 1-26.

- The entropy estimation of a distribution with no singleton returned `NA` with `ChaoWangJost` correction. Corrected.
- Entropy or diversity of a vector of zeros returned 0. It now returns `NA`.

- Abundance and probability vector objects. See `?SpeciesDistribution`.
- Hurlbert diversity. See `?Hurlbert`.
- `Optimal.Similarity`.
- Miller-Madow estimator of entropy (Miller, 1955) added in `bcShannon()`.
- Chao and Jost (2015) estimator of diversity added in `bcTsallis()` and `bcDiversity()`. New "best" estimator.
- Chao et al. (2015) probability estimation of observed species. See `?TunedPs`.
- Estimators of the number of species. See `?Richness`.
- Abundance Frequency Count of species. See `?AbdFreqCount`.
- Community profiles can be calculated with confidence intervals. See `?CommunityProfile`.
- Random Communities. See `?rCommunity`.

- Applying `bcTsallis` and similar functions with a probability vector instead of abundance values could cause errors depending on the correction. Correction is now forced to `None` with a warning.
- Allowed rounding error was too small on some systems (typically r-patched-solaris-sparc) to recognize probability vectors. The difference between their sum and 1 had to be less than 3 times `.Machine$double.eps`. Now set to S times (where S is the number of species, i.e. the vector's length).
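
The tolerance rule described in the last item can be sketched as follows. This is only an illustration of the stated rule, not the package's code; the helper name `looks_like_probabilities` is hypothetical.

```r
# Hypothetical helper, not the package's code: a vector is treated as
# probabilities when its sum is within S * machine epsilon of 1.
looks_like_probabilities <- function(x) {
  S <- length(x)                                  # number of species
  abs(sum(x) - 1) < S * .Machine$double.eps
}

looks_like_probabilities(c(0.2, 0.3, 0.5))   # a probability vector
looks_like_probabilities(c(12, 7, 1))        # abundances
```

Scaling the tolerance with the vector's length reflects the fact that rounding errors can accumulate with the number of terms in the sum.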

- Zhang and Grabchak (2014) bias correction for Shannon beta entropy added.
- Unbiased estimator of Rao's entropy added (`bcRao`).

- `Dqz()` and `Hqz()` returned an error if all probability values were 0 except one.

- Improved readability of error messages for bad arguments.
- Improved formatting of `summary.DivPart()`. Lines were too long.
- Improved legend for the x-axis of `plot.DivPart` ("alpha and gamma" instead of "alpha/gamma").
- Improved support of `PhyloValue` objects (summary added).
- Improved help for `MetaCommunity`.

- `ChaoPD()` returned an incorrect value when q=0 and some probabilities were equal to 0.

- Full support of similarity-based diversity added

- Default values for arguments added whenever possible.

- Zhang (2012) bias correction for Shannon entropy added.
- Zhang and Grabchak (2014) bias correction for Tsallis entropy added.

- `DivEst()` always calculated neutral diversity of simulated communities so the confidence interval was erroneous for phylodiversity. Corrected.

- `Paracou618.dist` distance matrix between species of `Paracou618.MC` added.
- Chao, Wang and Jost (2013) bias correction for Shannon entropy added.
- `EntropyCI` function added: Entropy of Monte-Carlo simulated communities.
- Tools to manipulate MetaCommunity objects added (see `?MergeMC`).
- `SimTest` class added to test a value against a simulated distribution (see `?SimTest`).
- Vignette added.

- `Imports` directive rather than `Depends` for *ade4*.
- `mergeandlabel` does not return warnings any longer (column names are better addressed).

- `Hqz()` was erroneous for q<>1. Corrected.
- `bcPhyloEntropy()` and `bcPhyloDiversity()` returned an incorrect `$Distribution` component. Corrected.
- `summary.MCentropy()` did not return the name of the tree. Corrected.
- Legend was not displayed in `plot.DivProfile(..., Which="Communities")`. Corrected.

- First Version.