Fits the joint model proposed by Henderson and colleagues (2000)
joineRML
is an extension of the joineR package for fitting joint models of time-to-event data and multivariate longitudinal data. The model fitted in joineRML is an extension of the Wulfsohn and Tsiatis (1997) and Henderson et al. (2000) models, which is comprised of (K + 1)-sub-models: a Cox proportional hazards regression model (Cox, 1972) and a K-variate linear mixed-effects model - a direct extension of the Laird and Ware (1982) regression model. The model is fitted using a Monte Carlo Expectation-Maximization (MCEM) algorithm, which closely follows the methodology presented by Lin et al. (2002).
As noted in Hickey et al. (2016), there is a lack of statistical software available for fitting joint models to multivariate longitudinal data. This is contrary to a growing methodology in the statistical literature. joineRML
is intended to fill this void.
The main workhorse function is mjoint
. As a simple example, we use the heart.valve
dataset from the package and fit a bivariate joint model.
library(joineRML)data(heart.valve)hvd <- heart.valve[!is.na(heart.valve$log.grad) & !is.na(heart.valve$log.lvmi), ]set.seed(12345)fit <- mjoint(formLongFixed = list("grad" = log.grad ~ time + sex + hs,"lvmi" = log.lvmi ~ time + sex),formLongRandom = list("grad" = ~ 1 | num,"lvmi" = ~ time | num),formSurv = Surv(fuyrs, status) ~ age,data = list(hvd, hvd),timeVar = "time")
The fitted model is assigned to fit
. We can apply a number of functions to this object, e.g. coef
, logLik
, plot
, print
, ranef
, fixef
, summary
, AIC
, getVarCov
, vcov
, confint
, sigma
, update
, formula
, resid
, and fitted
. In addition, several special functions have been added, including dynSurv
, dynLong
, and baseHaz
, as well as plotting functions for objects inheriting from the dynSurv
, dynLong
, ranef
, and mjoint
functions. For example,
summary(fit)plot(fit, param = 'gamma')
mjoint
automatically estimates approximate standard errors using the empirical information matrix (Lin et al., 2002), but the bootSE
function can be used as an alternative.
If you spot any errors or wish to see a new feature added, please file an issue at https://github.com/graemeleehickey/joineRML/issues or email Graeme Hickey.
For an overview of the model estimation being performed, please see the technical vignette, which can be accessed by
vignette('technical', package = 'joineRML')
For a demonstration of the package, please see the introductory vignette, which can be accessed by
vignette('joineRML', package = 'joineRML')
This project is funded by the Medical Research Council (Grant number MR/M013227/1).
To install the latest developmental version, you will need R version (version 3.3.0 or higher) and some additional software depending on what platform you are using.
If not already installed, you will need to install Rtools. Choose the version that corresponds to the version of R that you are using.
If not already installed, you will need to install Xcode Command Line Tools. To do this, open a new terminal and run
$ xcode-select --install
The latest developmental version will not yet be available on CRAN. Therefore, to install it, you will need devtools
. You can check you are using the correct version by running
Once the prerequisite software is installed, you can install joineRML
by running the following command in an R console
library('devtools')install_github('graemeleehickey/joineRML')
Cox DR. Regression models and life-tables. J R Stat Soc Ser B Stat Methodol. 1972; 34(2): 187-220.
Henderson R, Diggle PJ, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics. 2000; 1(4): 465-480.
Hickey GL, Philipson P, Jorgensen A, Kolamunnage-Dona R. Joint modelling of time-to-event and multivariate longitudinal outcomes: recent developments and issues. BMC Med Res Methodol. 2016; 16(1): 117.
Laird NM, Ware JH. Random-effects models for longitudinal data. Biometrics. 1982; 38(4): 963-974.
Lin H, McCulloch CE, Mayne ST. Maximum likelihood estimation in the joint analysis of time-to-event and multiple longitudinal variables. Stat Med. 2002; 21: 2369-2382.
Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997; 53(1): 330-339.
Added Zenodo DOI badge to README.
Added ORCID IDs for authors.
Remove unused objects once finished with them to try and reduce memory overheads.
Changed maintainer to Pete Philipson.
Added smoothed predicted survival curves to the plot.dynSuv()
. Smoothing is based on the contrainsed B-splines method.
dynSurv()
now includes an argument to specify a horizon time from the last known observation time.
simData()
now includes an argument to choose multivariate t-distributed random effects with varying degrees of freedom, thus allowing for sensitivity analyses of heavier tail distributions.
Minor corrections to documentation.
Minor bug fixes to plotting functions.
Added MRC to DESCRIPTION as funder.
Added Depsy badge to README.
The MCEM algorithm can be implemented with vanilla Monte Carlo and quasi-Monte Carlo (using either the scrambled Sobol sequence or the Halton sequence). This is implemented through the type
argument passed to the list of control
parameters in mjoint()
.
bootSE()
now has option to use parallel computing via the foreach
package.
Fixed some small errors in the epileptic.qol
dataset.
Fixed situation where a tibble might be given as the dataset (#55 @ellessenne).
Catches errors in bootstrap due to "bad" data and automatically restarts the bootstrap (#57 @ellessenne)
Added hex sticker badge.
Moved the make files and raw data for qol
and renal
datasets into the ~/data-raw/
directory.
Added @ellessenne as package contributor for testing and bug corrections (PRs #55 and #57).
Added new functions dynSurv()
and dynLong()
, which generates survival probabilities and xpected longitudinal predictions, respectively, for a new subject conditional on their last measurement time and longitudial history. Prediction can be implemented using either a first order approximation or a Monte Carlo simulation approach.
Added an associated print()
method for dynSurv
and dynLong
objects.
Added an associated plot()
method for dynSurv
and dynLong
objects.
Added a function baseHaz()
for extracting the centered and uncetered estimates of the baseline hazard function.
print()
and summary()
now report the total computation time in addition to just the EM algorithm time. This was deemed useful after some examples showed that the time to get initial values was more expensive than the time for the MCEM algorithm to converge.
Fixed a bug that prevented models being fitted with no covariates in the survival sub-model, i.e. Surv() ~ 1
.
Correction to the vignette description of mjoint()
arguments.
Removed enumintem
package for Sweave vignette to satisfy CRAN checks on macOS (release).
Updated Makevars
and Markvars.win
to allow for OpenMP, which can be used by RcppArmadillo.
Minor tidy-up of in-code comments.
Minor updates and corrections to documentation.
Added unit tests for new features.
Changed the technical vignette to Rnw (with engine = Sweave) from ltx (with engine = R.rsp) in attempt to remove some CMD check warnings.
Shortened the vignette to make it compile quicker (removed execution of bootstrapping).
Lots of tweaks of minor formatting tweaks in the documentation.
Fixed a bug that prevented factors with >2 levels being included in the time-to-event sub-model.
Fixed package registration, which strangely broke on R 3.4.0 for OSX platform.
joineRML
version 0.2.1 will depend on R version >=3.3.0 to remedy issue with sigma.mjoint()
S3 issue.
Added a new badge to the README.
Add residual()
and fitted()
functions for mjoint
objects.
Added a plot function -- plot.ranef.mjoint()
-- for ranef.mjoint
objects.
bootSE()
now uses control parameters from fitted model and allows for individual parameter overwrite.
Added a check that any initial covariance matrix given is positive-definite.
Added a check that dimensions of any inits
given match the calculated dimensions from the model formulae.
Added a check that if multiple repeated longitudinal outcoems were given, that each subject contributes at least one measurement per outcome.
Changed the API so that postRE
and approx.se
arguments are now replaced by a single pfs
argument, which stands for post fit statistics. If TRUE
, then both the approximate SEs and the BLUPs (and SEs) are calculated and returned. This change is to facilitate other post fit statistics, e.g. residuals.
sampleData()
now allows for sampling without replacement.
plot()
and plotConvergence()
now have the option to discard burn-in phase iterations from the MCEM algorithm.
plot()
and plotConvergence()
now plot the log-likelihood trace.
vcov()
now calculates the variance-covariance matrix rather than inside mjoint()
and then extracting it. It also utilises the QR-decomposition inverse function and the Moore-Penrose matrix inverse, as in some cases the matrix was nearly singular.
hessian()
(and therefore vcov()
) now calculate the contribution for the random effect variance terms rather than the random effect precision (1 divided by the variance) terms. The correct contribution to the score for off-diagonal terms is now also impleted.
The left-hand side of the formLongFixed
now handles transformations.
vcov()
now returns the variance-covariance matrix as intended. Previously it was only returning the profile empirical information matrix.
Patched a major bug in gammaUpdate()
where ties in failure times were not being properly handled. The code for gammaUpdate_approx()
always worked fine, as it was based only on the score vector. This bug manifested when bootSE()
was called due to the resampling with replacement yielding datasets with many more ties than in the original dataset used to fit the model. To fix it, the information matrix required scaling at each failure time by the number of failures in the data. The formula for the information matrix in the Technical Details vignette has also been updated.
Patched a minor bug with prevented univariate random-intercept models for being fitted.
Patched a minor bug in plotting convergence traces.
Patched a minor bug with bootstrapping univariate joint models without passing the MLEs as the initial values to the mjoint()
call.
Renamed approxSE()
function to hessian()
.
Renamed control
argument earlyPhase
to burnin
.
Default settings for control
parameters updated based on practical experience.
Package now Depends
on survival
and nlme
rather than Imports
to allow require()
statements to be removed from code.
Prevented roxygen
from exporting all functions.
Fixed Imports following CRAN Checks of v0.1.1.
Minor documentation edits and corrections.
Minor code tidy-up with slight speed-ups and stabilisations where found.
LICENSE upgraded to GPL-3 to be compatible with joineR
v1.1.0.
Removed internal function fast_nearPD()
from package code as was unused.
Removed internal function EexpArma()
from package code as was unused.
Added unit tests.
Registered native C++ routines and disabled symbol search to satisfy CRAN CMD checks.