Create Tidy Data Frames of Marginal Effects for 'ggplot' from Model Outputs

Compute marginal effects from statistical models and return the results as tidy data frames. These data frames are ready to use with the 'ggplot2'-package. Marginal effects can be calculated for many different models. Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using 'ggplot2'.


Why marginal effects?

Results of regression models are typically presented as tables that are easy to understand. For more complex models that include interaction or quadratic / spline terms, tables of numbers are less helpful and difficult to interpret. In such cases, marginal effects are far easier to understand. In particular, visualizing marginal effects makes it possible to intuitively grasp how predictors and outcome are associated, even for complex models.

Aim of this package

ggeffects computes marginal effects at the mean or average marginal effects from statistical models and returns the result as tidy data frames. These data frames are ready to use with the ggplot2-package.
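
To make the idea of "marginal effects at the mean" concrete, here is a minimal base-R sketch. It is not the package's internal implementation: it assumes the built-in mtcars data, and the names fit_demo, grid and tidy_demo are hypothetical. The focal predictor is varied over a grid, the other covariate is held at its mean, and the predictions are arranged under the column names that ggeffects uses for its output (x, predicted, conf.low, conf.high, group).

```r
# Hypothetical illustration with the built-in mtcars data (not taken from the package).
fit_demo <- lm(mpg ~ wt + hp, data = mtcars)

# Vary the focal term 'wt' over its range; hold 'hp' constant at its mean.
grid <- data.frame(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 10),
  hp = mean(mtcars$hp)
)

# Predictions with 95% confidence intervals from base R's predict().
pr <- predict(fit_demo, newdata = grid, interval = "confidence", level = 0.95)

# Arrange the result in a tidy layout with ggeffects-style column names.
tidy_demo <- data.frame(
  x = grid$wt,
  predicted = pr[, "fit"],
  conf.low = pr[, "lwr"],
  conf.high = pr[, "upr"],
  group = factor(1)
)
head(tidy_demo)
```

A data frame shaped like this can be passed directly to ggplot(), with conf.low and conf.high mapped to a ribbon.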

Documentation and Support

Please visit https://strengejacke.github.io/ggeffects/ for documentation and vignettes. In case you want to file an issue or contribute in another way to the package, please follow this guide. For questions about the functionality, you may either contact me via email or also file an issue.

ggeffects supports many different models and is easy to use

Marginal effects can be calculated for many different models. Currently supported model-objects are: lm, glm, glm.nb, lme, lmer, glmer, glmer.nb, nlmer, glmmTMB, gam (package mgcv), vgam, gamm, gamm4, multinom, betareg, truncreg, coxph, gls, gee, plm, lrm, polr, clm, clm2, zeroinfl, hurdle, stanreg, brmsfit, lmRob, glmRob, brglm, rlm, svyglm and svyglm.nb. Other models not listed here are passed to a generic predict-function and might work as well, or maybe with ggeffect(), which effectively does the same as ggpredict().

Interaction terms, splines and polynomial terms are also supported. The two main functions are ggpredict() and ggeffect(). There is a generic plot()-method to plot the results using ggplot2.

Examples

The returned data frames always have the same, consistent structure and column names, so it's easy to create ggplot-plots without the need to re-write the function call. x and predicted are the values for the x- and y-axis. conf.low and conf.high could be used as ymin and ymax aesthetics for ribbons to add confidence bands to the plot. group can be used as grouping-aesthetics, or for faceting.

ggpredict() requires at least one, but not more than three terms specified in the terms-argument. Predicted values of the response are calculated along the values of the first term, optionally grouped by the other terms specified in terms.

library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

ggpredict(fit, terms = "c12hour")

#> # Predicted values for Total score BARTHEL INDEX 
#> # x = average number of hours of care per week 
#> 
#>   x predicted conf.low conf.high group
#>   0    75.444   73.257    77.630     1
#>   5    74.177   72.098    76.256     1
#>  10    72.911   70.931    74.890     1
#>  15    71.644   69.753    73.535     1
#>  20    70.378   68.564    72.191     1
#>  25    69.111   67.361    70.861     1
#>  30    67.845   66.144    69.545     1
#>  35    66.578   64.911    68.245     1
#>  40    65.312   63.661    66.962     1
#>  45    64.045   62.393    65.697     1
#>  ... and 25 more rows.

A possible call to ggplot could look like this:

library(ggplot2)
mydf <- ggpredict(fit, terms = "c12hour")
ggplot(mydf, aes(x, predicted)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)

However, there is also a plot()-method. This method uses convenient defaults, to easily create the most suitable plot for the marginal effects.

mydf <- ggpredict(fit, terms = "c12hour")
plot(mydf)

With three variables, predictions can be grouped and faceted.

ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))

#> # Predicted values for Total score BARTHEL INDEX 
#> # x = average number of hours of care per week 
#> 
#>  x predicted conf.low conf.high                           group      facet
#>  0    74.996   71.406    78.585          low level of education [2] Female
#>  0    73.954   69.354    78.554          low level of education   [1] Male
#>  0    75.714   73.313    78.115 intermediate level of education [2] Female
#>  0    74.673   71.055    78.290 intermediate level of education   [1] Male
#>  0    76.432   72.887    79.977         high level of education [2] Female
#>  0    75.391   71.040    79.741         high level of education   [1] Male
#>  5    73.729   70.219    77.239          low level of education [2] Female
#>  5    72.688   68.143    77.233          low level of education   [1] Male
#>  5    74.447   72.146    76.748 intermediate level of education [2] Female
#>  5    73.406   69.846    76.966 intermediate level of education   [1] Male
#>  ... and 200 more rows.

mydf <- ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))
ggplot(mydf, aes(x = x, y = predicted, colour = group)) +
  stat_smooth(method = "lm", se = FALSE) +
  facet_wrap(~facet)

plot() works for this case, as well:

plot(mydf)

There are some more features, which are explained in more detail in the package-vignette.

Contributing to the package

Please follow this guide if you would like to contribute to this package.

Installation

Latest development build

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

library(devtools)
devtools::install_github("strengejacke/ggeffects")

Please note the package dependencies when installing from GitHub. The GitHub version of this package may depend on latest GitHub versions of my other packages, so you may need to install those first, if you encounter any problems. Here's the order for installing packages from GitHub:

sjlabelled, sjmisc, sjstats, ggeffects, sjPlot

Official, stable release

To install the latest stable release from CRAN, type the following command into the R console:

install.packages("ggeffects")

Citation

In case you want / have to cite my package, please use citation('ggeffects') for citation information.

News

ggeffects 0.6.0

General

  • Reduce package dependencies.
  • Moved package effects from dependencies to suggested packages, due to the restrictive requirements (R >= 3.5).
  • New print()-method, with a nicer print of the returned data frame. This method replaces the summary()-method, which was removed.
  • ggeffect() now supports clm2-models from the ordinal-package.
  • ggpredict() has improved support for coxph-models from the survival-package (survival probabilities, cumulative hazards).

Changes to functions

  • The type-argument in ggpredict() now has additional options, type = "fe.zi" and type = "re.zi", to explicitly condition zero-inflated (mixed) models on their zero-inflation component.
  • The type-argument in ggpredict() now has additional options, type = "surv" and type = "cumhaz", to plot probabilities of survival or cumulative hazards from coxph-models.
  • ggpredict() gets arguments vcov.fun, vcov.type and vcov.args to calculate robust standard errors for confidence intervals of predicted values. These are based on the various sandwich::vcov*()-functions, hence robust standard errors can be calculated for all models that are supported by sandwich::vcov*().
  • The plot()-method gets two arguments line.size and dot.size, to determine the size of the geoms.
  • The ci-argument for the plot()-method now also accepts the character values "dash" and "dot" to plot dashed or dotted lines as confidence bands.
  • The terms-argument in ggpredict() and ggeffect() may also be a formula, which is more convenient for typing, but less flexible than specifying the terms as character vector with specific options.
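
As a brief illustration of the formula interface mentioned in the last item, the following sketch reuses the model from the Examples section. The object names pred_formula and pred_chr are hypothetical; the call assumes the formula form behaves as described in this release.

```r
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Terms given as a formula: quicker to type.
pred_formula <- ggpredict(fit, terms = ~ c12hour + c172code)

# Terms given as a character vector: allows per-term options in square brackets.
pred_chr <- ggpredict(fit, terms = c("c12hour", "c172code"))
```

Both calls should request the same predictions. The character form is the one to use when a term needs an option such as a value range.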

Bug fixes

  • Fixed improper calculation of confidence intervals for hurdle- and zero-inflated models (from package pscl), which could exceed the range of plausible values (e.g. below zero for incidence rates).
  • Fixed issues with calculation of confidence intervals for mixed models with polynomial terms.

ggeffects 0.5.0

General

  • New vignette Different Output between Stata and ggeffects.

Changes to functions

  • ggpredict() now automatically back-transforms predictions to the response scale for models with log-transformed response.
  • ggeffect() and ggpredict() now automatically set numeric vectors with 10 or more unique values to representative values (see rprs_values()), if these are used as second or third value in the terms-argument (to represent a grouping structure).
  • Fix memory allocation issue in ggeffect().
  • rprs_values() is now exported.
  • The pretty-argument is deprecated, because prettifying values almost always makes sense - so this is done automatically.
  • ggpredict() now supports brmsfit-objects with categorical-family.
  • ggalleffect() has been removed. ggeffect() now plots effects for all model terms if terms = NULL.
  • gginteraction() and ggpoly() have been removed, as ggpredict() and ggeffect() are more efficient and generic for plotting interaction or polynomial terms.

Bug fixes

  • Fix issues with categorical or ordinal outcome models (polr, clm, multinom) for ggeffect().
  • Fix issues with confidence intervals for mixed models with log-transformed response value.
  • Fix issues with confidence intervals for generalized mixed models when response value was a rate or proportion created with cbind() in model formula.

ggeffects 0.4.0

General

  • Removed alias names mem(), eff() and ame().
  • For mixed models (packages lme4, nlme, glmmTMB), the uncertainty of the random effect variances is now taken into account when type = "re".
  • Computing confidence intervals for mixed models should be much more memory efficient now, resulting less often in warnings about memory allocation problems.
  • Updated reference in CITATION to the publication in the Journal of Open Source Software.
  • A test-suite was added to the package.

New functions

  • pretty_range(), to create a pretty sequence of integers of a vector.

Changes to functions

  • ggpredict() gets a condition-argument to specify values at which covariates should be held constant, instead of their typical value.
  • The pretty-option for ggpredict() now calculates more values, leading to smoother plots.
  • The terms-argument in ggpredict() can now also select a range of feasible values for numeric values, e.g. terms = "age [pretty]". In contrast to the pretty-argument, which prettifies all terms, you can selectively prettify specific terms with this option.
  • The terms-argument in ggpredict() now also supports all shortcuts that are possible for the mdrt.values-argument in gginteraction(), so for instance terms = "age [meansd]" would return three values: mean(age) - sd(age), mean(age) and mean(age) + sd(age).
  • plot() gets some new arguments to control which plot-title to show or hide: show.title, show.x.title and show.y.title.
  • plot() gets a log.y argument to transform the y-axis to logarithmic scale, which might be useful for binomial models with predicted probabilities, or other models with log-alike link-functions.
  • The plot()-method for plotting all effects with ggpredict() (when terms = NULL) now allows arranging the plots in facets (using facets = TRUE).
  • Values in dot-argument for plot() are now passed down to ggplot2::scale_y*(), to control the appearance of the y-axis (like breaks).
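
The [meansd] shortcut and the condition-argument from the list above can be sketched as follows, reusing the model from the Examples section. The object names pred_meansd and pred_cond are hypothetical, and the choice of c161sex = 1 is only an example value.

```r
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Group predictions by neg_c_7 at mean - sd, mean and mean + sd.
pred_meansd <- ggpredict(fit, terms = c("c12hour", "neg_c_7 [meansd]"))

# Hold c161sex constant at 1 instead of its typical value.
pred_cond <- ggpredict(fit, terms = "c12hour", condition = c(c161sex = 1))
```

The first call should yield three groups, one per value of neg_c_7. The second returns a single group of predictions with c161sex fixed at the chosen value.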

Bug fixes

  • Fixed issue with binomial models that used cbind(...) as response variable.
  • Fixed issue with suboptimal precision of confidence or prediction intervals for mixed models (packages lme4, nlme), which are now more accurate.

ggeffects 0.3.4

General

  • Predictions for glmmTMB-objects now compute proper confidence intervals, due to a fix in package glmmTMB 0.2.1.
  • If terms in ggpredict() is missing or NULL, marginal effects for each model term are calculated. ggpredict() then returns a list of data frames, which can also be plotted with plot().
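
A short sketch of the behaviour described in the last item, reusing the model from the Examples section. The object name all_effects is hypothetical.

```r
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# No terms given: one data frame of predictions per model term.
all_effects <- ggpredict(fit)

# Each list element can be inspected like a single ggpredict() result.
head(all_effects[[1]])
```

Calling plot(all_effects) would then draw one plot per model term.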

Changes to functions

  • The jitter-argument from plot() now accepts a numeric value between 0 and 1, to control the width of the random variation in data points.
  • ggpredict() and ggeffect() can now predict transformed values, which is useful, for instance, to exponentiate predictions for log(term) on the original scale of the variable. See package vignette, section Marginal effects at specific values or levels for examples.

Bug fixes

  • Multivariate response models in brms with variable names with underscores and dots were not correctly plotted.

ggeffects 0.3.3

General

  • Better support for multivariate-response-models from brms.
  • Support for cumulative-link-models from brms.
  • ggpredict() now supports linear multivariate response models, i.e. lm() with multiple outcomes.

Changes to functions

  • ggpredict() gets a pretty-argument to reduce and "prettify" the value range from variables in terms for predictions. This applies to all variables in terms with more than 25 unique values.

Bug fixes

  • Recognize negative binomial family from brmsfit-models.

ggeffects 0.3.2

General

  • ggpredict(), ggeffect() and gginteraction() get a x.as.factor-argument to preserve factor-class for the x-column in the returned data frame.
  • The terms-argument now also allows the specification of a range of numeric values in square brackets, e.g. terms = "age [30:50]".
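
For instance, restricting predictions to a value range could look like the sketch below. It reuses the model from the Examples section with c12hour in place of the age variable mentioned above; the object name pred_range is hypothetical.

```r
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)

# Predict only for 30 to 50 hours of care per week.
pred_range <- ggpredict(fit, terms = "c12hour [30:50]")
range(pred_range$x)
```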

Bug fixes

  • Give proper warning that clm-models don't support full.data-argument.
  • emm() did not work properly for some random effects models.

ggeffects 0.3.1

General

  • Use convert_case() from sjlabelled, in preparation for the latest snakecase-package update.

Bug fixes

  • Model weights are now correctly taken into account.
