A general framework for constructing variable importance plots from
various types of machine learning models in R. Aside from some standard model-
specific variable importance measures, this package also provides model-
agnostic approaches that can be applied to any supervised learning algorithm.
These include 1) an efficient permutation-based variable importance measure and
2) variable importance based on Shapley values (Strumbelj and Kononenko,
2014).

vip is an R package for constructing variable importance plots (VIPs).
VIPs are part of a larger framework referred to as interpretable machine
learning (IML), which includes (but is not limited to) partial dependence
plots (PDPs) and individual conditional expectation (ICE) curves. While
PDPs and ICE curves (available in the R package pdp) help visualize
feature effects, VIPs help visualize feature impact (either locally or
globally). An in-progress, but comprehensive, overview of IML can be
found here: https://github.com/christophM/interpretable-ml-book.
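
To make the idea of permutation-based importance concrete, here is a minimal base-R sketch. It is not vip's implementation: `permute_importance()` and `rmse()` are hypothetical helpers written only for illustration, and the choice of `mtcars`, a linear model, and 10 repetitions are assumptions. The sketch shuffles one feature at a time and records how much the prediction error grows relative to the unshuffled baseline.

```r
# Hypothetical helper (not part of vip): permutation-based importance.
# For each feature, shuffle its column, re-predict, and measure the
# average increase in the error metric over `nsim` shuffles.
permute_importance <- function(fit, data, target, metric, nsim = 10) {
  features <- setdiff(names(data), target)
  baseline <- metric(data[[target]], predict(fit, newdata = data))
  scores <- vapply(features, function(feature) {
    mean(replicate(nsim, {
      shuffled <- data
      shuffled[[feature]] <- sample(shuffled[[feature]])
      metric(data[[target]], predict(fit, newdata = shuffled)) - baseline
    }))
  }, numeric(1))
  sort(scores, decreasing = TRUE)  # largest error increase first
}

# Hypothetical error metric: root mean squared error.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

set.seed(101)  # shuffling is random, so fix the seed for reproducibility
cars <- mtcars[, c("mpg", "wt", "hp", "qsec")]
fit <- lm(mpg ~ ., data = cars)
imp <- permute_importance(fit, cars, target = "mpg", metric = rmse)
print(imp)
```

Because the scores depend only on predictions, the same helper should work with any model that has a `predict()` method accepting `newdata`, which is what makes the approach model-agnostic.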

# The easiest way to get vip is to install it from CRAN:
install.packages("vip")

# Alternatively, you can install the development version from GitHub:
if (!requireNamespace("devtools")) {
  install.packages("devtools")
}
devtools::install_github("koalaverse/vip")

For details and example usage, visit the vip package website.
Added support for Spark (G)LMs.
Bug fixes.
Fixed bug in get_feature_names.ranger() so that it never returns NULL; it either returns the feature names or throws an error if they cannot be recovered from the model object (#43).
Added pkgdown site: https://github.com/koalaverse/vip.
Changed the truncate_feature_names argument of vi() to abbreviate_feature_names, which abbreviates all feature names rather than just truncating them.
Added CRAN-related badges (#32).
New generic vi_permute() for constructing permutation-based variable importance scores (#19).
Fixed bug and unnecessary error check in vint() (#38).
New vignette on using vip with unsupported models (using the Keras API to TensorFlow as an example).
Added basic sparklyr support.
Added support for XGBoost models (i.e., objects of class "xgb.Booster").
Added support for ranger models (i.e., objects of class "ranger").
Added support for random forest models from the party package (i.e., objects of class "RandomForest").
vip() gained a new argument, num_features, for specifying how many variable importance scores to plot. The default is set to 10.
The "." separator was changed to "_" in all argument names.
vi() gained three new arguments: truncate_feature_names (for truncating feature names in the returned tibble), sort (a logical argument specifying whether or not the resulting variable importance scores should be sorted), and decreasing (a logical argument specifying whether or not the variable importance scores should be sorted in decreasing order).
The output of vi_model.lm(), and hence vi(), now contains an additional column called Sign that contains the sign of the original coefficients (#27).
vi() gained a new argument, scale, for scaling the variable importance scores so that the largest is 100. Default is FALSE (#24).
vip() gained two new arguments, size and shape, for controlling the size and shape of the points whenever bar = FALSE (#9).
Added support for "H2OBinomialModel", "H2OMultinomialModel", and "H2ORegressionModel" objects (#8).