Exploratory Data Analysis using Random Forests

Functions for exploratory data analysis using random forests. They compute multivariate partial dependence; observation-, class-, and variable-wise marginal and joint permutation importance; and observation-specific measures of distance (supervised or unsupervised). All of these functions are accompanied by 'ggplot2' plotting functions.


This package extends the functionality of random forests fit by party (multivariate, regression, and classification), randomForestSRC (regression and classification), randomForest (regression and classification), and ranger (regression and classification).

The pkg subdirectory of the repository contains the actual package, which can be installed from GitHub with devtools:

devtools::install_github("zmjones/edarf", subdir = "pkg")

Functionality includes:

  • partial_dependence computes the expected prediction of the random forest when it is marginalized to depend on only a subset of the features. plot_pd plots the results.
  • variable_importance computes feature importance for arbitrary loss functions, aggregated across the training data or for individual observations. It can also be applied to subsets of the feature space in order to detect interactions.
  • extract_proximity computes or extracts proximity matrices, and plot_prox plots them as a biplot, given a matrix of principal components of the proximity matrix.
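
The three ideas in the list above can be sketched by hand. The following is a minimal illustration, not edarf's implementation and not its API: it assumes the randomForest package is installed, uses the built-in iris data, and every object name (features, grid, error_rate, and so on) is made up for this example. The loss used for importance here is the misclassification rate, chosen only for simplicity.

```r
# Hand-rolled sketch of partial dependence, permutation importance, and
# proximity. This is NOT edarf's code; it only illustrates the concepts.
library(randomForest)

set.seed(1)
fit <- randomForest(Species ~ ., data = iris, ntree = 200,
                    proximity = TRUE, oob.prox = FALSE)
features <- iris[, names(iris) != "Species"]

# Partial dependence: set one feature to each grid value for every row,
# predict class probabilities, and average the predictions over the rows.
grid <- seq(min(iris$Petal.Width), max(iris$Petal.Width), length.out = 10)
pd <- t(sapply(grid, function(value) {
  modified <- features
  modified$Petal.Width <- value
  colMeans(predict(fit, newdata = modified, type = "prob"))
}))
pd <- data.frame(Petal.Width = grid, pd)

# Permutation importance: shuffle one feature, predict again, and record how
# much the loss (misclassification rate) grows relative to unshuffled data.
error_rate <- function(data) mean(predict(fit, newdata = data) != iris$Species)
baseline <- error_rate(features)
importance <- sapply(names(features), function(name) {
  mean(replicate(20, {
    shuffled <- features
    shuffled[[name]] <- sample(shuffled[[name]])
    error_rate(shuffled)
  })) - baseline
})

# Proximity: the share of trees in which two observations fall into the same
# terminal node. Its principal components are what a biplot would display.
prox <- fit$proximity
pca <- prcomp(prox, scale. = TRUE)

print(pd)
print(sort(importance, decreasing = TRUE))
print(head(pca$x[, 1:2]))
```

In edarf, the corresponding steps are handled by partial_dependence, variable_importance, and extract_proximity. Their exact arguments are documented in the reference manual and are not reproduced here.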

If you use the package for research, please cite it.

@article{jones2016,
  doi = {10.21105/joss.00092},
  url = {http://dx.doi.org/10.21105/joss.00092},
  year  = {2016},
  month = {oct},
  publisher = {The Open Journal},
  volume = {1},
  number = {6},
  author = {Zachary M. Jones and Fridolin J. Linder},
  title = {edarf: Exploratory Data Analysis using Random Forests},
  journal = {The Journal of Open Source Software}
}

Pull requests, bug reports, feature requests, etc. are welcome!

News

edarf_0.1:

  • Initial CRAN release

install.packages("edarf")

Version: 1.1.1 by Zachary M. Jones, 3 months ago

Report a bug at https://github.com/zmjones/edarf

Browse source code at https://github.com/cran/edarf

Authors: Zachary M. Jones <zmj@zmjones.com> and Fridolin Linder <fridolin.linder@gmail.com>

Documentation: PDF manual

License: MIT + file LICENSE

Imports: data.table, ggplot2, mmpf

Suggests: party, randomForest, randomForestSRC, ranger, testthat, rmarkdown, knitr