Computes marginal effects from statistical models and returns the results as tidy data frames. These data frames are ready to use with the 'ggplot2'-package. Marginal effects can be calculated for many different models. Interaction terms, splines and polynomial terms are also supported. The main functions are ggpredict(), ggemmeans() and ggeffect(). There is a generic plot()-method to plot the results using 'ggplot2'.

Results of regression models are typically presented as tables that are easy to understand. For more complex models that include interaction or quadratic / spline terms, tables with numbers are less helpful and difficult to interpret. In such cases, *marginal effects* are far easier to understand. In particular, visualizing marginal effects makes it possible to see intuitively how predictors and outcome are associated, even for complex models.

**ggeffects** computes marginal effects at the mean or average marginal effects from statistical models and returns the result as tidy data frames. These data frames are ready to use with the **ggplot2**-package.

Please visit https://strengejacke.github.io/ggeffects/ for documentation and vignettes. If you want to file an issue or contribute to the package in another way, please follow this guide. For questions about the functionality, you may either contact me via email or file an issue.

Marginal effects can be calculated for many different models. Currently supported model-objects are: `lm`, `glm`, `glm.nb`, `lme`, `lmer`, `glmer`, `glmer.nb`, `nlmer`, `glmmTMB`, `gam` (package **mgcv**), `vgam`, `gamm`, `gamm4`, `multinom`, `betareg`, `truncreg`, `coxph`, `gls`, `gee`, `plm`, `lrm`, `polr`, `clm`, `clm2`, `zeroinfl`, `hurdle`, `stanreg`, `brmsfit`, `lmRob`, `glmRob`, `brglm`, `rlm`, `svyglm` and `svyglm.nb`. Other models not listed here are passed to a generic predict-function and might work as well; if not, they may work with `ggeffect()`, which effectively does the same as `ggpredict()`.

Interaction terms, splines and polynomial terms are also supported. The two main functions are `ggpredict()` and `ggeffect()`. There is a generic `plot()`-method to plot the results using **ggplot2**.

The returned data frames always have the same, consistent structure and column names, so it's easy to create ggplot-plots without the need to re-write the function call. `x` and `predicted` are the values for the x- and y-axis. `conf.low` and `conf.high` could be used as `ymin` and `ymax` aesthetics for ribbons to add confidence bands to the plot. `group` can be used as grouping-aesthetics, or for faceting.

`ggpredict()` requires at least one, but not more than three terms specified in the `terms`-argument. Predicted values of the response are calculated along the values of the first term, optionally grouped by the other terms specified in `terms`.

```
library(ggeffects)
data(efc)
fit <- lm(barthtot ~ c12hour + neg_c_7 + c161sex + c172code, data = efc)
ggpredict(fit, terms = "c12hour")
#> # Predicted values for Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted conf.low conf.high group
#> 0 75.444 73.257 77.630 1
#> 5 74.177 72.098 76.256 1
#> 10 72.911 70.931 74.890 1
#> 15 71.644 69.753 73.535 1
#> 20 70.378 68.564 72.191 1
#> 25 69.111 67.361 70.861 1
#> 30 67.845 66.144 69.545 1
#> 35 66.578 64.911 68.245 1
#> 40 65.312 63.661 66.962 1
#> 45 64.045 62.393 65.697 1
#> ... and 25 more rows.
```
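To make the column layout concrete, the following is a minimal base-R sketch that builds a data frame with the same columns (`x`, `predicted`, `conf.low`, `conf.high`, `group`) by hand. It is an illustration under stated assumptions, not the package's implementation: it uses the built-in `mtcars` data instead of `efc`, holds the non-focal covariate at its mean, and uses t-based 95% confidence limits from `predict()`.

```r
# Illustrative only: a hand-built data frame in the same layout as the
# output above, using base R and the built-in mtcars data.
fit_lm <- lm(mpg ~ wt + hp, data = mtcars)

# Grid over the focal predictor (wt); the other covariate is held at its mean
# (an assumption made for this sketch).
grid <- data.frame(
  wt = seq(min(mtcars$wt), max(mtcars$wt), length.out = 10),
  hp = mean(mtcars$hp)
)

# Predictions with standard errors, then t-based 95% confidence limits.
pr <- predict(fit_lm, newdata = grid, se.fit = TRUE)
crit <- qt(0.975, df = pr$df)

tidy_pred <- data.frame(
  x = grid$wt,
  predicted = unname(pr$fit),
  conf.low = unname(pr$fit - crit * pr$se.fit),
  conf.high = unname(pr$fit + crit * pr$se.fit),
  group = factor(1)
)
head(tidy_pred)
```

Because the column names match, a data frame like `tidy_pred` can be passed to the same kind of `ggplot()` call as the one used for `ggpredict()` output.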

A possible call to ggplot could look like this:

```
library(ggplot2)
mydf <- ggpredict(fit, terms = "c12hour")
ggplot(mydf, aes(x, predicted)) +
  geom_line() +
  geom_ribbon(aes(ymin = conf.low, ymax = conf.high), alpha = .1)
```

However, there is also a `plot()`-method. This method uses convenient defaults, to easily create the most suitable plot for the marginal effects.

```
mydf <- ggpredict(fit, terms = "c12hour")
plot(mydf)
```

**ggeffects** has a `plot()`-method with some convenient defaults, which allows quickly creating ggplot-objects.

With three variables, predictions can be grouped and faceted.

```
ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))
#> # Predicted values for Total score BARTHEL INDEX
#> # x = average number of hours of care per week
#>
#> x predicted conf.low conf.high group facet
#> 0 74.996 71.406 78.585 low level of education [2] Female
#> 0 73.954 69.354 78.554 low level of education [1] Male
#> 0 75.714 73.313 78.115 intermediate level of education [2] Female
#> 0 74.673 71.055 78.290 intermediate level of education [1] Male
#> 0 76.432 72.887 79.977 high level of education [2] Female
#> 0 75.391 71.040 79.741 high level of education [1] Male
#> 5 73.729 70.219 77.239 low level of education [2] Female
#> 5 72.688 68.143 77.233 low level of education [1] Male
#> 5 74.447 72.146 76.748 intermediate level of education [2] Female
#> 5 73.406 69.846 76.966 intermediate level of education [1] Male
#> ... and 200 more rows.
mydf <- ggpredict(fit, terms = c("c12hour", "c172code", "c161sex"))
ggplot(mydf, aes(x = x, y = predicted, colour = group)) +
  stat_smooth(method = "lm", se = FALSE) +
  facet_wrap(~facet)
```

`plot()` works for this case, as well:

```
plot(mydf)
```
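The idea of grouped and faceted predictions can also be sketched in base R. The example below is a rough illustration, not the package's method: it assumes the built-in `mtcars` data, a model without interaction terms, and hand-picked grid values. It crosses a focal term with two further terms and labels the columns `x`, `group` and `facet`.

```r
# Illustrative only: predictions over a full grid of three terms.
fit_grp <- lm(mpg ~ wt + factor(cyl) + am, data = mtcars)

# Every combination of the focal term (wt), the grouping term (cyl)
# and the faceting term (am); the grid values are chosen by hand.
grid <- expand.grid(wt = seq(2, 5, by = 1), cyl = c(4, 6, 8), am = c(0, 1))

grouped <- data.frame(
  x = grid$wt,
  predicted = unname(predict(fit_grp, newdata = grid)),
  group = factor(grid$cyl),
  facet = factor(grid$am, labels = c("automatic", "manual"))
)
head(grouped)
```

Each combination of `group` and `facet` then yields one line when `x` is mapped to the x-axis and `predicted` to the y-axis.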

There are some more features, which are explained in more detail in the package-vignette.

Please follow this guide if you would like to contribute to this package.

To install the latest development snapshot (see latest changes below), type the following commands into the R console:

`library(devtools); devtools::install_github("strengejacke/ggeffects")`

Please note the package dependencies when installing from GitHub. The GitHub version of this package may depend on the latest GitHub versions of my other packages, so you may need to install those first if you encounter any problems. Here's the order for installing packages from GitHub:

sjlabelled → sjmisc → sjstats → ggeffects → sjPlot

To install the latest stable release from CRAN, type the following command into the R console:

`install.packages("ggeffects")`

In case you want / have to cite my package, please use `citation('ggeffects')` for citation information.

- Reduce package dependencies.
- Moved package **effects** from dependencies to suggested packages, due to the restrictive requirements (R >= 3.5).
- New `print()`-method, with a nicer print of the returned data frame. This method replaces the `summary()`-method, which was removed.
- `ggeffect()` now supports `clm2`-models from the **ordinal**-package.
- `ggpredict()` has improved support for `coxph`-models from the **survival**-package (survival probabilities, cumulative hazards).

- The `type`-argument in `ggpredict()` now has additional options, `type = "fe.zi"` and `type = "re.zi"`, to explicitly condition zero-inflated (mixed) models on their zero-inflation component.
- The `type`-argument in `ggpredict()` now has additional options, `type = "surv"` and `type = "cumhaz"`, to plot probabilities of survival or cumulative hazards from `coxph`-models.
- `ggpredict()` gets arguments `vcov.fun`, `vcov.type` and `vcov.args` to calculate robust standard errors for confidence intervals of predicted values. These are based on the various `sandwich::vcov*()`-functions, hence robust standard errors can be calculated for all models that are supported by `sandwich::vcov*()`.
- The `plot()`-method gets two arguments `line.size` and `dot.size`, to determine the size of the geoms.
- The `ci`-argument for the `plot()`-method now also accepts the character values `"dash"` and `"dot"` to plot dashed or dotted lines as confidence bands.
- The `terms`-argument in `ggpredict()` and `ggeffect()` may also be a formula, which is more convenient for typing, but less flexible than specifying the terms as character vector with specific options.
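As a rough illustration of why a formula can stand in for a character vector of terms, base R's `all.vars()` recovers the variable names from a one-sided formula. This is only a sketch of the correspondence, not necessarily how the package converts formulas internally; the variable names are taken from the `efc` example above.

```r
# Two ways of naming the same focal terms.
terms_chr <- c("c12hour", "c172code", "c161sex")
terms_frm <- ~ c12hour + c172code + c161sex

# all.vars() returns the variable names of a formula in order of appearance.
terms_from_formula <- all.vars(terms_frm)
identical(terms_from_formula, terms_chr)
```

The character form remains more flexible because each string can carry extra options in square brackets, which have no counterpart inside a formula.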

- Fixed improper calculation of confidence intervals for hurdle- and zero-inflated models (from package **pscl**), which could exceed the range of plausible values (e.g. below zero for incidence rates).
- Fixed issues with calculation of confidence intervals for mixed models with polynomial terms.

- New vignette *Different Output between Stata and ggeffects*.

- `ggpredict()` now automatically back-transforms predictions to the response scale for models with log-transformed response.
- `ggeffect()` and `ggpredict()` now automatically set numeric vectors with 10 or more unique values to representative values (see `rprs_values()`), if these are used as second or third value in the `terms`-argument (to represent a grouping structure).
- Fix memory allocation issue in `ggeffect()`.
- `rprs_values()` is now exported.
- The `pretty`-argument is deprecated, because prettifying values almost always makes sense, so this is done automatically.
- `ggpredict()` now supports `brmsfit`-objects with categorical-family.
- `ggalleffect()` has been removed. `ggeffect()` now plots effects for all model terms if `terms = NULL`.
- `gginteraction()` and `ggpoly()` have been removed, as `ggpredict()` and `ggeffect()` are more efficient and generic for plotting interaction or polynomial terms.
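What back-transforming to the response scale means for a log-transformed response can be sketched in base R. This is a minimal illustration on the built-in `mtcars` data; simply exponentiating the prediction and its confidence limits is an assumption of the sketch and may differ in detail from what the package does.

```r
# Model with a log-transformed response.
fit_log <- lm(log(mpg) ~ wt, data = mtcars)
new_wt <- data.frame(wt = c(2, 3, 4))

# Predictions and confidence limits on the log scale ...
pr_log <- predict(fit_log, newdata = new_wt, interval = "confidence")

# ... and back-transformed to the original scale of mpg.
pr_resp <- exp(pr_log)
pr_resp
```

Note that the exponentiated prediction of a log-linear model corresponds to a geometric rather than an arithmetic mean of the response.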

- Fix issues with categorical or ordinal outcome models (`polr`, `clm`, `multinom`) for `ggeffect()`.
- Fix issues with confidence intervals for mixed models with log-transformed response value.
- Fix issues with confidence intervals for generalized mixed models when response value was a rate or proportion created with `cbind()` in model formula.

- Removed alias names `mem()`, `eff()` and `ame()`.
- For mixed models (packages **lme4**, **nlme**, **glmmTMB**), the uncertainty of the random effect variances is now taken into account when `type = "re"`.
- Computing confidence intervals for mixed models should be much more memory efficient now, resulting less often in warnings about memory allocation problems.
- Updated reference in `CITATION` to the publication in the Journal of Open Source Software.
- A test-suite was added to the package.

- `pretty_range()`, to create a pretty sequence of integers of a vector.

- `ggpredict()` gets a `condition`-argument to specify values at which covariates should be held constant, instead of their `typical` value.
- The `pretty`-option for `ggpredict()` now calculates more values, leading to smoother plots.
- The `terms`-argument in `ggpredict()` can now also select a range of feasible values for numeric values, e.g. `terms = "age [pretty]"`. In contrast to the `pretty`-argument, which prettifies all terms, you can selectively prettify specific terms with this option.
- The `terms`-argument in `ggpredict()` now also supports all shortcuts that are possible for the `mdrt.values`-argument in `gginteraction()`, so for instance `terms = "age [meansd]"` would return three values: mean(age) - sd(age), mean(age) and mean(age) + sd(age).
- `plot()` gets some new arguments to control which plot-title to show or hide: `show.title`, `show.x.title` and `show.y.title`.
- `plot()` gets a `log.y` argument to transform the y-axis to logarithmic scale, which might be useful for binomial models with predicted probabilities, or other models with log-alike link-functions.
- The `plot()`-method for plotting all effects with `ggpredict()` (when `terms = NULL`) now allows arranging the plots in facets (using `facets = TRUE`).
- Values in dot-argument for `plot()` are now passed down to `ggplot2::scale_y*()`, to control the appearance of the y-axis (like `breaks`).
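The values selected by the `[meansd]` shortcut, and the general idea of a "pretty" range, can be illustrated in base R. This is a sketch under assumptions: the `age` values are made up, and `base::pretty()` is used as a stand-in that may not match the package's own prettifying exactly.

```r
# Made-up example values for a numeric predictor.
age <- c(23, 35, 41, 47, 52, 58, 64, 70, 77, 85)

# "meansd": one standard deviation below the mean, the mean,
# and one standard deviation above the mean.
meansd_vals <- c(mean(age) - sd(age), mean(age), mean(age) + sd(age))

# A "pretty" range: round, evenly spaced values that cover the data.
pretty_vals <- pretty(age)

meansd_vals
pretty_vals
```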

- Fixed issue with binomial models that used `cbind(...)` as response variable.
- Fixed issue with suboptimal precision of confidence or prediction intervals for mixed models (packages **lme4**, **nlme**), which are now more accurate.

- Predictions for `glmmTMB`-objects now compute proper confidence intervals, due to a fix in package *glmmTMB* 0.2.1.
- If `terms` in `ggpredict()` is missing or `NULL`, marginal effects for each model term are calculated. `ggpredict()` then returns a list of data frames, which can also be plotted with `plot()`.

- The `jitter`-argument from `plot()` now accepts a numeric value between 0 and 1, to control the width of the random variation in data points.
- `ggpredict()` and `ggeffect()` can now predict transformed values, which is useful, for instance, to exponentiate predictions for `log(term)` on the original scale of the variable. See package vignette, section *Marginal effects at specific values or levels* for examples.

- Multivariate response models in *brms* with variable names with underscores and dots were not correctly plotted.

- Better support for multivariate-response-models from *brms*.
- Support for cumulative-link-models from *brms*.
- `ggpredict()` now supports linear multivariate response models, i.e. `lm()` with multiple outcomes.

- `ggpredict()` gets a `pretty`-argument to reduce and "prettify" the value range from variables in `terms` for predictions. This applies to all variables in `terms` with more than 25 unique values.

- Recognize negative binomial family from `brmsfit`-models.

- `ggpredict()`, `ggeffect()` and `gginteraction()` get a `x.as.factor`-argument to preserve factor-class for the `x`-column in the returned data frame.
- The `terms`-argument now also allows the specification of a range of numeric values in square brackets, e.g. `terms = "age [30:50]"`.
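How a term string with a bracketed range maps to a variable name and a set of values can be sketched with base-R string handling. The parsing below is purely hypothetical and written for illustration; it is not the package's parser and handles only the simple `from:to` form.

```r
# Hypothetical parsing of a term specification such as "age [30:50]".
term <- "age [30:50]"

# Variable name: everything before the opening bracket.
var_name <- trimws(sub("\\[.*$", "", term))

# Range specification: the text between the square brackets.
spec <- sub("^.*\\[(.*)\\].*$", "\\1", term)

# Split "from:to" and expand it into a sequence of values.
bounds <- as.numeric(strsplit(spec, ":", fixed = TRUE)[[1]])
values <- seq(bounds[1], bounds[2])

var_name
range(values)
```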

- Give proper warning that `clm`-models don't support `full.data`-argument.
- `emm()` did not work properly for some random effects models.

- Use `convert_case()` from *sjlabelled*, in preparation for the latest *snakecase*-package update.

- Model weights are now correctly taken into account.