Create Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs

Computes marginal effects from statistical models and returns the result as tidy data frames. These data frames are ready to use with the 'ggplot2'-package. Marginal effects can be calculated for many different models. Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using 'ggplot2'.



Why marginal effects?

Results of regression models are typically presented as tables that are easy to understand. For more complex models that include interaction or quadratic / spline terms, tables of numbers are less helpful and harder to interpret. In such cases, marginal effects are far easier to understand. In particular, visualizing marginal effects gives an intuitive idea of how predictors and outcome are associated, even for complex models.

Aim of this package

ggeffects computes marginal effects at the mean or average marginal effects from statistical models and returns the result as tidy data frames. These data frames are ready to use with the ggplot2-package.
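To make "marginal effects at the mean" concrete, here is a minimal base-R sketch of the idea. It is not the package's internal implementation: the model, the built-in mtcars data and the object names (fit_cars, grid, tidy) are assumptions chosen only for illustration. One focal predictor is varied, the others are held at their means, and the predictions are arranged in the tidy column layout that ggeffects returns.

```r
# Base-R sketch of "predictions at the mean" (illustrative only).
fit_cars <- lm(mpg ~ wt + hp + qsec, data = mtcars)

# Vary the focal term, hold the remaining predictors at their means.
grid <- data.frame(
  wt   = seq(2, 5, by = 0.5),
  hp   = mean(mtcars$hp),
  qsec = mean(mtcars$qsec)
)

# Predictions with confidence limits from predict.lm().
pr <- predict(fit_cars, newdata = grid, interval = "confidence")

# Arrange the result with the column names used by ggeffects.
tidy <- data.frame(
  x         = grid$wt,
  predicted = pr[, "fit"],
  conf.low  = pr[, "lwr"],
  conf.high = pr[, "upr"]
)
tidy
```

predict.lm() builds its limits from the t-distribution, so they need not match the limits reported by ggpredict() exactly.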

Documentation and Support

Please visit https://strengejacke.github.io/ggeffects/ for documentation and vignettes. In case you want to file an issue or contribute in another way to the package, please follow this guide. For questions about the functionality, you may either contact me via email or file an issue.

ggeffects supports many different models and is easy to use

Marginal effects can be calculated for many different models. Currently supported model-objects are: betareg, brglm, brmsfit, clm, clm2, clmm, coxph, gam (package mgcv), Gam (package gam), gamm, gamm4, gee, glm, glm.nb, glmer, glmer.nb, glmmTMB, glmmPQL, glmRob, gls, hurdle, ivreg, lm, lm_robust, lme, lmer, lmRob, lrm, MixMod, MCMCglmm, multinom, nlmer, plm, polr, rlm, stanreg, svyglm, svyglm.nb, truncreg, vgam, zeroinfl and zerotrunc. Other models not listed here are passed to a generic predict-function and might work as well, or maybe with ggeffect() or ggemmeans(), which effectively do the same as ggpredict().

Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using ggplot2.

Examples

The returned data frames always have the same, consistent structure and column names, so it's easy to create ggplot-plots without the need to re-write the function call. x and predicted are the values for the x- and y-axis. conf.low and conf.high could be used as ymin and ymax aesthetics for ribbons to add confidence bands to the plot. group can be used as grouping-aesthetics, or for faceting.

ggpredict() requires at least one, but not more than three terms specified in the terms-argument. Predicted values of the response are calculated along the values of the first term, optionally grouped by the other terms specified in terms.

library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

ggpredict(fit, terms = "c12hour")
#> 
#> # Predicted values of Total score BARTHEL INDEX 
#> # x = average number of hours of care per week 
#> 
#>    x predicted std.error conf.low conf.high
#>    0    75.444     1.116   73.257    77.630
#>   15    71.644     0.965   69.753    73.535
#>   35    66.578     0.851   64.911    68.245
#>   50    62.779     0.852   61.108    64.449
#>   70    57.713     0.970   55.811    59.614
#>   85    53.913     1.122   51.713    56.113
#>  100    50.113     1.309   47.547    52.680
#>  120    45.047     1.591   41.929    48.166
#>  135    41.248     1.817   37.686    44.810
#>  170    32.382     2.373   27.732    37.033
#> 
#> Adjusted for:
#> *  neg_c_7 = 11.84
#> *  c161sex =  1.76
#> * c172code =  1.97
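The other two main functions take the same terms-argument. The sketch below assumes that the optional packages emmeans (used by ggemmeans()) and effects (used by ggeffect()) may or may not be installed, so those calls are guarded. The object names pr, em and ef are illustrative.

```r
library(ggeffects)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Predictions based on predict().
pr <- ggpredict(fit, terms = "c12hour")

# ggemmeans() relies on the 'emmeans' package, ggeffect() on 'effects';
# both are only called here if those packages are installed.
if (requireNamespace("emmeans", quietly = TRUE)) {
  em <- ggemmeans(fit, terms = "c12hour")
}
if (requireNamespace("effects", quietly = TRUE)) {
  ef <- ggeffect(fit, terms = "c12hour")
}

# The results share the same column names (x, predicted, conf.low, ...).
names(pr)
```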

A possible call to ggplot could look like this:

library(ggplot2)
mydf <- ggpredict(fit, terms = "c12hour")
ggplot(mydf, aes(x, predicted)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)

However, there is also a plot()-method. This method uses convenient defaults, to easily create the most suitable plot for the marginal effects.

mydf <- ggpredict(fit, terms = "c12hour")
plot(mydf)


With three variables, predictions can be grouped and faceted.

ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))
#> 
#> # Predicted values of Total score BARTHEL INDEX 
#> # x = average number of hours of care per week 
#> 
#> # low level of education
#> # [1] Male
#>    x predicted std.error conf.low conf.high
#>    0    73.954     2.347   69.354    78.554
#>   45    62.556     2.208   58.228    66.883
#>   85    52.424     2.310   47.896    56.951
#>  170    30.893     3.085   24.847    36.939
#> 
#> # low level of education
#> # [2] Female
#>    x predicted std.error conf.low conf.high
#>    0    74.996     1.831   71.406    78.585
#>   45    63.597     1.603   60.456    66.738
#>   85    53.465     1.702   50.130    56.800
#>  170    31.934     2.606   26.827    37.042
#> 
#> # intermediate level of education
#> # [1] Male
#>    x predicted std.error conf.low conf.high
#>    0    74.673     1.845   71.055    78.290
#>   45    63.274     1.730   59.883    66.665
#>   85    53.142     1.911   49.397    56.887
#>  170    31.611     2.872   25.982    37.241
#> 
#> # intermediate level of education
#> # [2] Female
#>    x predicted std.error conf.low conf.high
#>    0    75.714     1.225   73.313    78.115
#>   45    64.315     0.968   62.418    66.213
#>   85    54.183     1.209   51.815    56.552
#>  170    32.653     2.403   27.943    37.362
#> 
#> # high level of education
#> # [1] Male
#>    x predicted std.error conf.low conf.high
#>    0    75.391     2.220   71.040    79.741
#>   45    63.992     2.176   59.727    68.258
#>   85    53.860     2.364   49.226    58.494
#>  170    32.330     3.257   25.946    38.713
#> 
#> # high level of education
#> # [2] Female
#>    x predicted std.error conf.low conf.high
#>    0    76.432     1.809   72.887    79.977
#>   45    65.034     1.712   61.679    68.388
#>   85    54.902     1.910   51.158    58.646
#>  170    33.371     2.895   27.697    39.045
#> 
#> Adjusted for:
#> * neg_c_7 = 11.84

mydf <- ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))
ggplot(mydf, aes(x = x, y = predicted, colour = group)) +
  stat_smooth(method = "lm", se = FALSE) +
  facet_wrap(~facet)

plot() works for this case, as well:

plot(mydf)

There are some more features, which are explained in more detail in the package-vignette.

Contributing to the package

Please follow this guide if you would like to contribute to this package.

Installation

Latest development build

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

library(devtools)
devtools::install_github("strengejacke/ggeffects")

Please note the package dependencies when installing from GitHub. The GitHub version of this package may depend on latest GitHub versions of my other packages, so you may need to install those first, if you encounter any problems. Here's the order for installing packages from GitHub:

sjlabelled, sjmisc, sjstats, ggeffects, sjPlot

Official, stable release


To install the latest stable release from CRAN, type the following command into the R console:

install.packages("ggeffects")

Citation

In case you want / have to cite my package, please use citation('ggeffects') for citation information.


News

ggeffects 0.9.0

General

  • Minor revisions to docs and vignettes.
  • Reduce package dependencies.
  • Better support, including confidence intervals, for some of the already supported model types.
  • New package-vignette Customize Plot Appearance.

New supported models

  • ggemmeans() now supports type = "fe.zi" for glmmTMB-models, i.e. predicted values are conditioned on the fixed effects and the zero-inflation components of glmmTMB-models.
  • ggpredict() now supports MCMCglmm, ivreg and MixMod (package GLMMadaptive) models.
  • ggemmeans() now supports MCMCglmm and MixMod (package GLMMadaptive) models.
  • ggpredict() now computes confidence intervals for gam models (package gam).

New functions

  • new_data(), to create a data frame from all combinations of predictor values. This data frame typically can be used for the newdata-argument in predict(), in case it is necessary to quickly create an own data frame for this argument.
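A minimal sketch of how new_data() might be combined with predict(). The bracket notation for selecting specific values and the object name nd are assumptions for illustration; check ?new_data in the installed version for the exact behaviour.

```r
library(ggeffects)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# All combinations of the requested values; the remaining predictors
# are held constant (assumed syntax: values in square brackets).
nd <- new_data(fit, terms = c("c12hour [30,50,80]", "c172code [1,3]"))

# Use the grid as newdata-argument in predict().
nd$predicted <- predict(fit, newdata = nd)
nd
```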

Changes to functions

  • ggpredict() no longer stops when predicted values with confidence intervals for glmmTMB- and other zero-inflated models can't be computed with type = "fe.zi", and only returns the predicted values without confidence intervals.
  • When ggpredict() fails to compute confidence intervals, a more informative error message is given.
  • plot() gets a connect.lines-argument, to connect dots from plots with discrete x-axis.

Bug fixes

  • ggpredict() did not work with glmmTMB- and other zero-inflated models, when type = "fe.zi" and model- or zero-inflation formula had a polynomial term that was held constant (i.e. not part of the terms-argument).
  • Confidence intervals for zero-inflated models and type = "fe.zi" could not be computed when the model contained polynomial terms and a very long formula (issue with deparse(), cutting off very long formulas).
  • The plot()-method put different spacing between groups when a numeric factor was used along the x-axis and the factor levels were not equally spaced.
  • Minor fixes regarding calculation of predictions from some already supported models.
  • Fixed issues with multiple response models of class lm in ggeffect().
  • Fixed issues with encoding in help-files.

ggeffects 0.8.0

General

  • Minor changes to meet forthcoming changes in purrr.
  • For consistency reasons, both type = "fe" and type = "re" return population-level predictions for mixed effects models (lme4, glmmTMB). The difference is that type = "re" also takes the random effect variances for prediction intervals into account. Predicted values at specific levels of random effect terms are described in the package-vignettes Marginal Effects for Random Effects Models and Marginal Effects at Specific Values.
  • Revised docs and vignettes.
  • Give more informative warning for misspelled variable names in terms-argument.
  • Added custom (pre-defined) color-palettes, that can be used with plot(). Use show_pals() to show all available palettes.
  • Use a more appropriate calculation for confidence intervals of predictions for models with a zero-inflation component.

New supported models

  • ggpredict() and ggeffect() now support brms-models with additional response information (like trial()).
  • ggpredict() now supports Gam, glmmPQL, clmm, and zerotrunc-models.
  • All models supported by the emmeans package should also work with the new ggemmeans()-function. Since this function is quite new, there still might be some bugs, though.

New functions

  • ggemmeans() to compute marginal effects by calling emmeans::emmeans().
  • theme_ggeffects(), which can be used with ggplot2::theme_set() to set the ggeffects-theme as default plotting theme. This makes it easier to add further theme-modifications like sjPlot::legend_style() or sjPlot::font_size().
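A short sketch of setting the ggeffects-theme as the default, assuming the efc example model used earlier in this README. The object names old_theme and p are illustrative.

```r
library(ggplot2)
library(ggeffects)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)
mydf <- ggpredict(fit, terms = "c12hour")

# Make the ggeffects-theme the session default; theme_set() returns
# the previous theme, which is kept for restoring it later.
old_theme <- theme_set(theme_ggeffects())

# Hand-built plots now pick up the ggeffects-theme as well.
p <- ggplot(mydf, aes(x, predicted)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)
p

# Restore the previous default theme.
theme_set(old_theme)
```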

Changes to functions

  • Added prediction-type based on simulations (type = "sim") to ggpredict(), currently for models of class glmmTMB and merMod.
  • x.cat is a new alias for the argument x.as.factor.
  • The plot()-method gets a ci.style-argument, to define different styles for the confidence bands for numeric x-axis-terms.
  • The print()-method gets an x.lab-argument to print value labels instead of numeric values if x is categorical.
  • emm() now also supports all prediction-types, like ggpredict().

Bug fixes

  • Fixed issue where confidence intervals could not be computed for variables with very small values, that differ only after the second decimal place.
  • Fixed issue with ggeffect(), which did not work if data had variables with more than 8 digits (fractional part longer than 8 digits).
  • Fixed issue with multivariate response models fitted with brms or rstanarm when argument ppd = TRUE.
  • Fixed issue with glmmTMB-models for type = "fe.zi", which could mess up the correct order of predicted values for x.
  • Fixed minor issue with glmmTMB-models for type = "fe.zi" or type = "re.zi", when first terms had the [all]-tag.
  • Fixed minor issue in the print()-method for mixed effects models, when predictions were conditioned on all model terms and adjustment was only done for random effects (output-line "adjusted for").
  • Fixed issue for mixed models, where confidence intervals were not completely calculated, if terms included a factor and contrasts were set to other values than contr.treatment.
  • Fixed issue with messed up order of confidence intervals for glm-object and heteroskedasticity-consistent covariance matrix estimation.
  • Fixed issue for glmmTMB-models, when variables in dispersion or zero-inflation formula did not appear in the fixed effects formula.
  • The condition-argument was not always considered for some model types when calculating confidence intervals for predicted values.

ggeffects 0.7.0

General

  • Support for monotonic predictors in brms models (mo()).
  • For generalized additive models, values for splines are no longer automatically prettified (which ensures smooth plots, without the need to use the [all] tag, i.e. terms="... [all]").
  • If splines or polynomial terms are used, a message is printed to indicate that using the [all] tag, i.e. terms="... [all]", will produce smoother plots.
  • The package-vignette Marginal Effects at Specific Values now has examples on how to get marginal effects for each group level of random effects in mixed models.
  • Revised print()-method, that - for larger data frames - only prints representative data rows. Use the n-argument inside the print()-method to force a specific number of rows to be printed.

Changes to functions

  • Added an n-tag for the terms-argument in ggpredict() and ggeffect(), to give more flexibility according to how many values are used for "prettifying" large value ranges.
  • Added a sample-tag for the terms-argument in ggpredict() and ggeffect(), to pick a random sample of values for plotting.
  • ggpredict() and ggeffect() now also return the standard error of predictions, if available.
  • The jitter-argument in plot() now also changes the amount of noise for plots of models with binary outcome (when rawdata = TRUE).
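The tags for the terms-argument can be sketched as follows. The exact spelling inside the brackets ("n=5", "sample=8") is an assumption based on the tag names given above, and the object names are illustrative; see ?ggpredict for the authoritative syntax.

```r
library(ggeffects)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Default: a "prettified" selection of values from the range of c12hour.
p_default <- ggpredict(fit, terms = "c12hour")

# n-tag: controls how many values are used for prettifying.
p_n <- ggpredict(fit, terms = "c12hour [n=5]")

# sample-tag: a random sample of the observed values.
set.seed(123)
p_sample <- ggpredict(fit, terms = "c12hour [sample=8]")

# all-tag: every unique value of c12hour.
p_all <- ggpredict(fit, terms = "c12hour [all]")

# Compare how many rows each variant returns.
c(default = nrow(p_default), n = nrow(p_n),
  sample = nrow(p_sample), all = nrow(p_all))
```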

Bug fixes

  • Fix issue with proper calculation of random effect variances for glmmTMB models for type="re" and type="re.zi" in general, and also for models with ar1 random effects structure.

ggeffects 0.6.0

General

  • Reduce package dependencies.
  • Moved package effects from dependencies to suggested packages, due to the restrictive requirements (R >= 3.5).
  • New print()-method, with a nicer print of the returned data frame. This method replaces the summary()-method, which was removed.
  • ggeffect() now supports clm2-models from the ordinal-package.
  • ggpredict() has improved support for coxph-models from the survival-package (survival probabilities, cumulative hazards).

Changes to functions

  • The type-argument in ggpredict() now has additional options, type = "fe.zi" and type = "re.zi", to explicitly condition zero-inflated (mixed) models on their zero-inflation component.
  • The type-argument in ggpredict() now has additional options, type = "surv" and type = "cumhaz", to plot probabilities of survival or cumulative hazards from coxph-models.
  • ggpredict() gets arguments vcov.fun, vcov.type and vcov.args to calculate robust standard errors for confidence intervals of predicted values. These are based on the various sandwich::vcov*()-functions, hence robust standard errors can be calculated for all models that are supported by sandwich::vcov*().
  • The plot()-method gets two arguments line.size and dot.size, to determine the size of the geoms.
  • The ci-argument for the plot()-method now also accepts the character values "dash" and "dot" to plot dashed or dotted lines as confidence bands.
  • The terms-argument in ggpredict() and ggeffect() may also be a formula, which is more convenient for typing, but less flexible than specifying the terms as character vector with specific options.
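As a sketch of the formula interface for the terms-argument, the two calls below are intended to request the same predictions, once as a character vector and once as a one-sided formula. The object names are illustrative; tags such as [all] are only available in the character form.

```r
library(ggeffects)

data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Terms as character vector (allows tags like "c12hour [all]").
p_chr <- ggpredict(fit, terms = c("c12hour", "c172code"))

# Terms as one-sided formula (shorter to type, no tags).
p_frm <- ggpredict(fit, terms = ~ c12hour + c172code)

# Both results have the same number of rows.
c(character = nrow(p_chr), formula = nrow(p_frm))
```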

Bug fixes

  • Fixed improper calculation of confidence intervals for hurdle- and zero-inflated models (from package pscl), which could exceed the range of plausible values (e.g. below zero for incidence rates).
  • Fixed issues with calculation of confidence intervals for mixed models with polynomial terms.

ggeffects 0.5.0

General

  • New vignette Different Output between Stata and ggeffects.

Changes to functions

  • ggpredict() now automatically back-transforms predictions to the response scale for models with a log-transformed response.
  • ggeffect() and ggpredict() now automatically set numeric vectors with 10 or more unique values to representative values (see rprs_values()), if these are used as second or third value in the terms-argument (to represent a grouping structure).
  • Fix memory allocation issue in ggeffect().
  • rprs_values() is now exported.
  • The pretty-argument is deprecated, because prettifying values almost always makes sense - so this is done automatically.
  • ggpredict() now supports brmsfit-objects with categorical-family.
  • ggalleffect() has been removed. ggeffect() now plots effects for all model terms if terms = NULL.
  • gginteraction() and ggpoly() have been removed, as ggpredict() and ggeffect() are more efficient and generic for plotting interaction or polynomial terms.
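The automatic back-transformation for log-transformed responses can be illustrated with a small sketch. The model on the built-in mtcars data and the object names (fit_log, pr_log, manual) are assumptions for illustration only. The manual exp() of the base-R predictions is shown next to the ggpredict() result for comparison.

```r
library(ggeffects)

# Model with a log-transformed response.
fit_log <- lm(log(mpg) ~ wt, data = mtcars)

# Predictions are returned on the response scale (mpg), not on the log scale.
pr_log <- ggpredict(fit_log, terms = "wt")

# Manual back-transformation of base-R predictions at the same values of wt.
manual <- exp(predict(fit_log, newdata = data.frame(wt = pr_log$x)))

head(data.frame(wt = pr_log$x, ggpredict = pr_log$predicted, manual = manual))
```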

Bug fixes

  • Fix issues with categorical or ordinal outcome models (polr, clm, multinom) for ggeffect().
  • Fix issues with confidence intervals for mixed models with log-transformed response value.
  • Fix issues with confidence intervals for generalized mixed models when response value was a rate or proportion created with cbind() in model formula.
