Simulates and plots quantities of interest (relative hazards, first differences, and hazard ratios) for linear coefficients, multiplicative interactions, polynomials, penalised splines, and non-proportional hazards, as well as stratified survival curves from Cox Proportional Hazard models. It also simulates and plots marginal effects for multiplicative interactions.
https://github.com/christophergandrud/simPH/issues
simPH is an R package for simulating and plotting quantities of interest (relative hazards, first differences, and hazard ratios) for linear coefficients, multiplicative interactions, polynomials, penalised splines, and non-proportional hazards, as well as stratified survival curves from Cox Proportional Hazard models.
For more information plus examples, please see the description paper in the Journal of Statistical Software.
To cite the paper please use:
@article{simPH_JSS,
author = {Christopher Gandrud},
title = {simPH: An R Package for Illustrating Estimates from Cox
Proportional Hazard Models Including for Interactive and Nonlinear
Effects},
journal = {Journal of Statistical Software},
year = {2015},
volume = {65},
issue = {3},
pages = {1--20}
}
The package includes the following functions:
coxsimLinear: a function for simulating relative hazards, first differences, hazard ratios, and hazard rates for linear, non-time interacted covariates from Cox Proportional Hazard models.
coxsimtvc: a function for simulating time interactive hazards (relative hazards, first differences, and hazard ratios) for covariates from Cox Proportional Hazard models. The function will calculate time-interactive hazard ratios for multiple strata estimated from a stratified Cox Proportional Hazard model.
coxsimSpline: a function for simulating quantities of interest from penalised splines using multivariate normal distributions. Currently does not support simulating hazard rates from stratified models. Note: be extremely careful about the number of simulations you ask the function to find. It is very easy to ask for more than your computer can handle.
coxsimPoly: a function for simulating quantities of interest for a range of values for a polynomial nonlinear effect from Cox Proportional Hazard models.
coxsimInteract: a function for simulating quantities of interest for linear multiplicative interactions, including marginal effects and hazard rates.
Results from these functions can be plotted using the simGG method. The syntax and capabilities of simGG vary depending on the sim object class you are using:
simGG.simlinear: plots simulated linear time-constant hazards using ggplot2.
simGG.simtvc: uses ggplot2 to graph the simulated time-varying relative hazards, first differences, hazard ratios or stratified hazard rates.
simGG.simspline: uses ggplot2 to plot quantities of interest from simspline objects, including relative hazards, first differences, hazard ratios, and hazard rates.
simGG.simpoly: uses ggplot2 to graph the simulated polynomial quantities of interest.
simGG.siminteract: uses ggplot2 to graph linear multiplicative interactions.
Because in almost all cases simGG returns a ggplot2 object, you can add additional aesthetic attributes in the normal ggplot2 way. See the ggplot2 documentation for more details.
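As a rough sketch of how the simulation and plotting functions fit together, the following fits a small Cox model, simulates hazard ratios with coxsimLinear, and plots them with simGG. It assumes the CarpenterFdaData example data set that ships with the package and the argument names b, qi, Xj, and nsim as given in the package documentation; the model specification itself is chosen only for illustration.

```r
library(survival)
library(simPH)

# Example data assumed to ship with simPH
data("CarpenterFdaData")

# Fit a deliberately small Cox PH model (illustrative specification)
M1 <- coxph(Surv(acttime, censor) ~ prevgenx + lethal + deathrt1 + stafcder,
            data = CarpenterFdaData)

# Simulate hazard ratios over a range of fitted values of stafcder
Sim1 <- coxsimLinear(M1, b = "stafcder", qi = "Hazard Ratio",
                     Xj = seq(1237, 1600, by = 2), nsim = 100)

# simGG dispatches on the class of the sim object (here simlinear)
Plot1 <- simGG(Sim1)

# The result is a ggplot2 object, so further layers can be added
Plot1 + ggplot2::theme_bw()
```

Keeping nsim small while exploring a model keeps the run time short; it can be raised once the plot looks sensible.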
SurvExpand: a function for converting a data frame of non-equal interval continuous observations into equal interval continuous observations. This is useful to do before creating time interactions.
tvc: a function for creating time interactions. Currently supports 'linear', natural 'log', and exponentiation ('power').
setXl: a function for setting valid Xl values given a sequence of fitted Xj values. This makes it more intuitive to find hazard ratios and first differences for comparisons between some Xj fitted values and Xl values other than 0.
ggfitStrata: a function to plot fitted stratified survival curves estimated from survfit using ggplot2. This function builds on the survival package's plot.survfit function. One major advantage is the ability to split the survival curves into multiple plots and arrange them in a grid. This makes it easier to examine many strata at once. Otherwise they can be very bunched up.
MinMaxLines: a function for summarising the constricted intervals from the simulations, including the median, upper and lower bounds and the middle 50% of these intervals.
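To make the three kinds of time interaction listed for tvc more concrete, here is a minimal base R sketch of the usual forms such terms take. It does not call tvc itself; the toy data frame, its column names, and the power value are hypothetical and chosen only for illustration.

```r
# Hypothetical toy data: a covariate x observed at times t
toy <- data.frame(t = c(1, 2, 4, 8), x = c(0.5, 0.5, 1.5, 1.5))

# Linear time interaction: x * t
toy$x_linear <- toy$x * toy$t

# Natural log time interaction: x * log(t)
toy$x_log <- toy$x * log(toy$t)

# Power time interaction: x * t^p (p = 2 is an arbitrary illustrative choice)
p <- 2
toy$x_power <- toy$x * toy$t^p

toy
```

The interaction column is then included in the Cox model alongside the base covariate, which is why the data usually need to be in equal time intervals first.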
The package is available on CRAN and can be installed in the normal R way.
To install the development version use the devtools function install_github. Here is the code for installing the most recent development version:
devtools::install_github('christophergandrud/simPH')
Before running the simulation and graph functions in this package carefully consider how many simulations you are about to make. Especially for hazard rates over long periods of time and with multiple strata, you can be asking simPH to run very many simulations. This will be computationally intensive.
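The following sketch shows why the number of simulated values grows quickly. It is not simPH's internal code: it uses only survival and MASS, the lung data that ships with survival, and illustrative values for nsim, Xj, and Xl, to draw coefficients from a multivariate normal distribution and turn them into hazard ratios.

```r
library(survival)
library(MASS)

# Fit a small Cox PH model on the lung data shipped with survival
fit <- coxph(Surv(time, status) ~ age + sex, data = lung)

set.seed(1)
nsim <- 1000                    # number of coefficient draws
Xj <- seq(40, 80, by = 10)      # fitted values of age
Xl <- 60                        # comparison value of age

# Draw coefficient vectors from a multivariate normal distribution
draws <- mvrnorm(n = nsim, mu = coef(fit), Sigma = vcov(fit))

# Hazard ratio for each draw and each Xj: exp((Xj - Xl) * beta_age)
hr <- exp(outer(draws[, "age"], Xj - Xl))

# nsim * length(Xj) simulated values are held in memory
dim(hr)

# Median and central 95% interval for each fitted value
t(apply(hr, 2, quantile, probs = c(0.025, 0.5, 0.975)))
```

Hazard rates add a further dimension for every time point and stratum, so the size of the result multiplies again.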
For more information about simulating parameter estimates to make interpretation of results easier see:
Licht, Amanda A. 2011. “Change Comes with Time: Substantive Interpretation of Nonproportional Hazards in Event History Analysis.” Political Analysis 19: 227–43.
King, Gary, Michael Tomz, and Jason Wittenberg. 2000. “Making the Most of Statistical Analyses: Improving Interpretation and Presentation.” American Journal of Political Science 44(2): 347–61.
For more information about stratified Cox PH models (and frailties, which I am working to incorporate in future versions) see:
Box-Steffensmeier, Janet M, and Suzanna De Boef. 2006. “Repeated Events Survival Models: the Conditional Frailty Model.” Statistics in Medicine 25(20): 3518–33.
To learn more about shortest probability intervals (and also for the source of the code that made this possible in simPH) see:
Liu, Y., Gelman, A., & Zheng, T. (2015). "Simulation-efficient Shortest Probability Intervals." Statistics and Computing 25: 809–819.
Also good: Hyndman, R. J. (1996). "Computing and Graphing Highest Density Regions." The American Statistician, 50(2): 120–126.
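For intuition, a shortest probability interval can be approximated directly from a set of simulated values. The sketch below is a simple empirical version (the narrowest window containing the requested share of sorted draws), not the more efficient estimator described by Liu, Gelman, and Zheng and not the code simPH uses; the function name shortest_interval is hypothetical.

```r
# Empirical shortest interval covering a share `ci` of the draws
shortest_interval <- function(x, ci = 0.95) {
  x <- sort(x)
  n <- length(x)
  k <- ceiling(ci * n)            # number of draws the interval must contain
  starts <- seq_len(n - k + 1)    # candidate lower end points
  widths <- x[starts + k - 1] - x[starts]
  best <- which.min(widths)       # the narrowest window wins
  c(lower = x[best], upper = x[best + k - 1])
}

set.seed(1)
skewed <- exp(rnorm(5000))        # right-skewed, like simulated hazard ratios

shortest_interval(skewed)
quantile(skewed, c(0.025, 0.975)) # central interval for comparison
```

For right-skewed quantities such as hazard ratios, the shortest interval sits closer to zero than the central interval with the same coverage.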
For more information about interpreting interaction terms:
Brambor, Thomas, William Roberts Clark, and Matt Golder. 2006. “Understanding Interaction Models: Improving Empirical Analyses.” Political Analysis 14(1): 63–82.
For an example of how non-proportional hazard results were often presented before simPH see (some of the problems I encountered in this paper were a major part of why I'm developing this package):
Gandrud, Christopher. 2013. “The Diffusion of Financial Supervisory Governance Ideas.” Review of International Political Economy. 20(4): 881-916.
I intend to expand the quantities of interest that can be simulated and graphed for Cox PH models. I am also currently working on functions that can simulate and graph hazard ratios estimated from Fine and Gray competing risks models.
I am also working on a way to graph hazard ratios with frailties.
Licensed under GPL-3
Fixed a bug when including rug plots with simGG.siminteract. Thanks to Ting Shuo Huang for reporting.
Corrected a bug in simGG.siminteract and simGG.simlinear where the shape parameter was incorrectly supplied to geom_line.
Enable compatibility with dplyr version 0.4.4.
Fixes bugs when finding hazard rates.
Minor internal code readability improvements.
Xl message is only shown if Xl is incorrectly specified.
Fixed issues with ggfitStrata and its documentation caused by changes to the survival and gridExtra packages.
Minor documentation change removing links to ggplot2 documentation that no longer exists from version 2.0.0.
melt is loaded from the data.table package rather than reshape2 (which is no longer a dependency). This should provide speed improvements.
coxsimInteract can now handle interactions with categorical variables.
Update to SurvExpand that improves speed and stability.
Add citation to JSS article.
Gandrud, Christopher. 2015. simPH: An R Package for Illustrating Estimates from Cox Proportional Hazard Models Including for Interactive and Nonlinear Effects. Journal of Statistical Software. 65(3): 1-20.
!!!!NOTE!!!!: This version alters default behaviour and functionality in a way that could break your code.
simGG now includes rug plots.
For simGG.simtvc, rug plots are not produced. All x-axes are "Time".
simGG default type is now ribbons, rather than points.
For simGG plots with a covariate represented on the x-axis, the variable name is printed by default on the plot.
as.data.frame.coxsim added to convert the output of a coxsim function to a data frame.
simGG.simspline no longer supports scatter3d plots for Hazard Rates.
Bug fixes in simGG.simspline with Hazard Rates.
Minor internal code and documentation improvements.
Trivial README.md change to address CRAN Markdown processor issue.
No longer depends on DataCombine.
Minor documentation improvements.
Fixed a bug in coxsimSpline where the SimID was not correctly identified.
Addressed minor bug in SurvExpand from data.table.
Minor changes to SurvExpand to work with dplyr 0.3.
Now requires dplyr 0.3.
coxsim objects now also given the class data.frame.
No longer depends on plyr.
Added PartialData argument to SurvExpand. The argument allows the user to determine whether or not to only keep the expanded data needed to find the Cox partial likelihood.
Made MinMaxLines a top level function that is useful for returning basic summary statistics from coxsim constricted simulation intervals. Using the argument clean = TRUE returns the simulations' medians, the minimum and maximum of the constricted intervals (as set in the coxsim call) and the lower and upper 50% of the constricted intervals.
MinMaxLines also now relies on dplyr rather than plyr. This improves performance.
!!!! smoother argument for simGG is deprecated. Use method instead. The functionality is exactly the same. !!!!
tvc now accepts a vector of variable names.
jss-example demo added.
simPH-overview vignette added.
extremesDrop argument added to coxsim functions. This drops simulated quantity of interest values that are Inf/NA/NaN/>1000000. These can create problems with plotting and finding the spin.
Small utility setXl created that makes it easier to set Xl values.
Automatic bspline cleaning features added for coxsimSpline.
Minor documentation improvements.
Internal code cleaning.
coxsimPoly is able to simulate quantities of interest for only the polynomial terms without the linear component. Feature request from Mattia Valente.
Minor error message improvements.
Added SurvExpand for converting a data frame of non-equal interval continuous observations into equal interval continuous observations. This is very useful if you intend to create time interactions. Thanks to Mintao Nie and an anonymous reviewer for the suggestion.
Roxygen documentation improvements.
Added type argument to simGG.spline. Allows the user to plot using points, ribbons, or lines. NOTE: the ribbons standalone argument is deprecated.
Added SmoothSpline argument to simGG.simspline to use smoothing splines on the simulations. Creates a smoother graph.
Uses mvrnorm from the MASS package instead of rmultnorm to draw the simulations.
Documentation and other internal improvements.
coxsimInteract with the expMarg argument.
simGG.siminteract.
Added legend argument to allow the user to hide plot legends, when applicable.
Minor aesthetic updates and documentation clarifications.
Added the argument ribbons to the simGG method. This produces a plot with shaded areas ('ribbons') for the minimum and maximum simulation values as well as the central 50% of this area. It also plots a line for the median value of this area. (Thanks to Jeff Chwieroth for the suggestion.)
Internal improvements to minimise the size of simulation output objects and improve performance if qi = "Hazard Rate".
A number of bug fixes.
coxsimSpline if white spaces are not entered before and after equal (=) signs in the bspline argument.
Added package vignette (partially completed).
Expanded coxsimPoly so that it is capable of simulating other quantities of interest. Also, bug fixes.
spin = TRUE works for quantities of interest when Xj - Xl = 0, i.e. in situations when all of the simulated quantities are 1 (or 0 for First Differences).
Documentation improvements.
Bug fixes, including:
Increased flexibility for setting confidence levels. Now they may be set at any numeric value from 0 through 1.
Choice of using confidence levels for the middle simulation values or the shortest probability interval.
qi now automatically determined by simGG.
means argument added to coxsimLinear and coxsimInteract. This allows the user to choose if they would like Hazard Rates to be fitted using the variables (other than the variables of interest) set to their means rather than 0. Note: it does not currently support models that include polynomials created by I.
means will be added to the other simulation commands in future versions.
Minor bug fixes and documentation updates.
Major update to the way simPH plots simulated objects. Instead of using separate commands for plotting objects of different sim classes it now uses the method simGG.
In practical terms this means that you can now just use the command simGG rather than the old gg... commands.
Minor bug fix for ci argument.
Minor change: now line drawn.
Standardise how hazard rates are calculated.
Made updates so that the package is compatible with data.table package version 1.8.8.
Minor improvement to ggtvc legend.
Minor bug fixes.
Updated the syntax for simcoxtvc and ggtvc for hazard ratios and stratified hazard rates so that it matches the syntax for the other commands.
Other bug fixes.
Minor bug fix for ggspline when qi == 'First Difference'.
Added coxsimSpline and ggspline to simulate and plot quantities of interest for penalised splines.
Minor bug fixes, documentation improvements for coxsimInteract.
Improved error messages in coxsimInteract and minor documentation changes.
Added coxsimInteract to simulate quantities of interest for linear multiplicative interactions and gginteract for plotting these simulations.
Also made an important fix to how coxsimLinear calculates hazard rates and how gglinear plots these simulations.
Other documentation fixes.
Minor change to how coxsimtvc runs so that it is no longer dependent on reshape.
Updated documentation and added the ability to change the smoothing line colour for first difference and relative hazard plots.
Added functions for simulating and plotting linear non-time-varying hazards.
coxsimLinear: simulates linear non-time-varying hazards
gglinear: plots linear non-time-varying hazards
First version, largely ported from simtvc version 0.04 (http://christophergandrud.github.com/simtvc/), with the addition of the ability to work with polynomials. This includes two functions:
coxsimPoly: simulates polynomial relative hazards
ggpoly: graphs the simulated polynomial relative hazards