Measuring Multivariate Dependence Using Distance Multivariance
Distance multivariance is a measure of dependence which can be used to detect
and quantify dependence of arbitrarily many random vectors. The necessary functions are
implemented in this package, and examples are given. It includes distance multivariance,
distance multicorrelation, dependence structure detection, tests of independence and
copula versions of distance multivariance based on the Monte Carlo empirical transform.
Detailed references are given in the package description; as a starting point for the
theoretical background we refer to:
B. Böttcher, Dependence and Dependence Structures: Estimation and Visualization Using
the Unifying Concept of Distance Multivariance. Open Statistics, Vol. 1, No. 1 (2020),
(News above the first "Changes...." is not displayed properly in RStudio, i.e., this line is invisible)
Changes in Version 2.0.0
- new function 'multivariance.test', which provides all multivariance-related tests through a unified interface. Its return values follow the conventions for tests in R, in particular the p.value and the value of the test statistic. The return value is of class "htest" (as is standard for other hypothesis tests, e.g. ks.test, t.test).
- 'cdm' now has the argument "external.dm.fun", which can be used to pass an external function for the computation of the distance matrix (allowing major speedups for non-standard distances)
- 'multivariances.all' now has named return values
- 'resample.multivariance' now also works with 'type="all"', for simultaneous computation of the p.values of multivariance, total-multivariance, 2-multivariance and 3-multivariance
- 'multicorrelation' now computes various types of multicorrelations.
- 'multivariance.timing' provides methods for detailed estimation of the computation time, which might be useful e.g. when planning simulation studies.
- 'multicorrelation' now has different defaults and new arguments.
- the centered distance matrices are now stored in a list rather than a 3-dim array. Thus the return value of 'cdms' was changed, and correspondingly the arguments of all '*.multivariance' functions.
- major speedup
- 'multivariance.pvalue' accepts and returns NA and NaN
- 'm.multivariance' returns NA when 3-multivariance is used for only 2 variables.
- 'pearson.pvalue' partially ignored the option "type": it always used the test statistic of multivariance, even though the parameters were computed for the given "type".
- the option "verbose" in 'dependence.structure' now works as expected
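The "htest" convention mentioned for 'multivariance.test' can be illustrated with a test from base R. The following is a minimal sketch that uses ks.test only and does not call this package; the toy data and variable names are chosen for illustration. According to the entry above, the result of 'multivariance.test' should be accessible in the same way, but that is not shown here.

```r
# Base-R illustration of the "htest" convention; uses ks.test, not the package.
set.seed(1)                # reproducible toy data
x <- rnorm(100)
y <- runif(100)

res <- ks.test(x, y)       # two-sample Kolmogorov-Smirnov test

is_htest  <- inherits(res, "htest")   # standard R tests return class "htest"
p_value   <- res$p.value              # p-value of the test
statistic <- res$statistic            # value of the test statistic

print(res)                 # formatted by the print method for "htest"
```

Because the class is shared, code that extracts p.value or statistic from one test can usually be reused for any other test returning an "htest" object.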
Changes in Version 1.2.1
- updated references
- various typos corrected
Changes in Version 1.2.0
- 'independence.test' is now also implemented with type "pearson_approx", providing the fast p-value approximation developed in arXiv:1808.07280. For this, the functions 'pearson.qf' (a Gaussian quadratic form estimate based on mean, variance and skewness) and 'pearson.pvalue' (the corresponding p-value estimate based on new moment estimators) are also introduced.
- In 'cdm' one can now explicitly specify the use of "isotropic" continuous negative definite functions. This speeds up the calculation for this case by a factor of about 100.
- the option "squared" now also works for multivariance with option "correlation=TRUE".
- 'multivariances.all' returns NA for 3-multivariance if only two variables are given.
- speed up of various functions
- various typos corrected
Changes in Version 1.1.0
- 'm.multivariance' a function to calculate the m-multivariance
- 'multivariances.all' a function to calculate standard/total/m-multivariance simultaneously
- 'resample.multivariance' implements the resampling method which can be used to get less conservative tests than the distribution-free methods
- 'dependence.structure' a function to generate a graphical model of the dependence structure
- various examples of the use of 'dependence.structure'
- The standard output of 'multivariance' is now the squared distance multivariance scaled by the sample size. Use 'Nscale = FALSE' to get the value without this scaling. There were two reasons for this change: 1. It is now the same setting as for 'total.multivariance'. 2. This is the only value which can (roughly) be interpreted without further calculations.
- improved documentation. In particular, it is now clearly stated that the squared values are the standard output of 'multivariance' and 'total.multivariance'
- some speed up
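The effect of scaling by the sample size can be sketched in base R. This is a standalone illustration, not the package's implementation: the helpers 'centered_dm' and 'sq_multivariance' are hypothetical names, the distance is assumed to be Euclidean, and any further normalization the package may apply is ignored.

```r
# Standalone base-R sketch; does not call the 'multivariance' package.

# Doubly centered (negative) Euclidean distance matrix of one variable.
centered_dm <- function(x) {
  d <- as.matrix(dist(x))
  -(d - outer(rowMeans(d), colMeans(d), "+") + mean(d))
}

# Squared sample distance multivariance of a list of variables
# (each a numeric vector of the same length), optionally scaled by N.
sq_multivariance <- function(vars, Nscale = TRUE) {
  cdms <- lapply(vars, centered_dm)  # one centered matrix per variable
  n <- nrow(cdms[[1]])               # sample size N
  m2 <- mean(Reduce(`*`, cdms))      # mean of the elementwise product
  if (Nscale) n * m2 else m2
}

set.seed(1)
x <- rnorm(50)
y <- rnorm(50)

scaled   <- sq_multivariance(list(x, y))                  # N * squared value
unscaled <- sq_multivariance(list(x, y), Nscale = FALSE)  # squared value only
```

In this sketch the two results differ exactly by the factor N = 50, which is the scaling that 'Nscale = FALSE' switches off.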
Changes in Version 1.0.5 2017-11-01