Fitting Dynamic Frailty Models with the EM Algorithm

Fits semiparametric dynamic frailty models according to the methodology of Putter and van Houwelingen (2015). Intermediate models, where the frailty is piecewise constant on prespecified intervals, are also supported. The frailty process is taken to have a specific autocorrelation structure, and the supported distributions include gamma, inverse Gaussian, power variance family (PVF) and positive stable.

This is an R package for fitting semiparametric dynamic frailty models with the EM algorithm. The hazard for individual j from cluster i is specified as:
λ_{ij}(t | Z_{i}(t)) = Z_{i}(t) exp(β^{⊤} x_{ij}(t)) λ_{0}(t).
The model used here is described in detail in Putter & van Houwelingen (2015). The distribution of Z_{i}(t) is described by two parameters: θ, an inverse-variability parameter of Z_{i}(t) for a fixed t, and λ, which describes the autocorrelation of the process, so that for t_{1} ≤ t_{2}, cor(Z_{i}(t_{1}), Z_{i}(t_{2})) = exp(−λ(t_{2} − t_{1})).
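To make the hazard specification concrete, the following is a minimal sketch of how the conditional hazard could be evaluated when the frailty is piecewise constant. It is an illustration only and not part of the package: the function name `cond_hazard`, the interval breaks, the frailty values, the coefficients and the constant baseline hazard are all made up, and the covariates are taken as fixed in time for simplicity.

```r
# Hypothetical helper (not a dynfrail function): conditional hazard of one
# individual, given a piecewise constant frailty path for its cluster.
cond_hazard <- function(t, x, beta, z_values, breaks, base_haz) {
  # findInterval() gives the index k of the interval [breaks[k], breaks[k + 1])
  # containing t, so z_values[k] plays the role of Z_i(t).
  k <- findInterval(t, breaks)
  z_values[k] * exp(sum(beta * x)) * base_haz(t)
}

breaks   <- c(0, 1, 2)         # interval start points: [0, 1), [1, 2), [2, Inf)
z_values <- c(0.8, 1.1, 1.5)   # made-up frailty values, one per interval
beta     <- c(0.5, -0.2)       # made-up regression coefficients
x        <- c(1, 2)            # covariate vector of one individual
base_haz <- function(t) rep(0.1, length(t))  # constant baseline hazard

# Hazard at three time points, one in each frailty interval
h <- cond_hazard(c(0.5, 1.5, 2.5), x, beta, z_values, breaks, base_haz)
print(h)
```

The hazard of the same individual changes over time only through Z_{i}(t) here, because both the covariates and the baseline hazard are held constant in this toy setup.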

Estimation proceeds in two stages. For fixed (θ, λ), the likelihood is maximized with respect to (β, λ_{0}) with the EM algorithm, which gives the value of the profile likelihood at (θ, λ). This profile likelihood is then itself maximized with respect to (θ, λ).
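The nested structure of this estimation can be sketched on a toy problem. The example below is not the dynfrail algorithm: it uses a normal sample, with the standard deviation as the outer parameter (in the role of (θ, λ)) and the mean as the inner parameter (in the role of (β, λ_{0})), and it uses `optimize()` for the inner step instead of an EM algorithm. All names are hypothetical.

```r
set.seed(1)
y <- rnorm(200, mean = 3, sd = 2)

# Inner step: for a fixed outer parameter (sigma), maximize the
# log-likelihood over the inner parameter (mu) and return the maximum.
profile_loglik <- function(log_sigma, y) {
  sigma <- exp(log_sigma)
  inner <- optimize(
    function(mu) sum(dnorm(y, mean = mu, sd = sigma, log = TRUE)),
    interval = range(y), maximum = TRUE
  )
  inner$objective
}

# Outer step: maximize the profile log-likelihood itself.
# The outer parameter is optimized on the log scale to keep it positive.
outer_fit <- optimize(profile_loglik, interval = c(-5, 5), y = y,
                      maximum = TRUE)
sigma_hat <- exp(outer_fit$maximum)
print(sigma_hat)
```

In this toy case the result can be checked against the closed-form maximum likelihood estimate `sqrt(mean((y - mean(y))^2))`. No such closed form is available for the frailty model, which is why every evaluation of the profile likelihood needs a full inner maximization.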

Installation

Install the development version from GitHub:

devtools::install_github("tbalan/dynfrail")

The following packages are needed to build dynfrail:

Features

Semiparametric Z(t) that changes value at every t, or piecewise constant Z(t).

Clustered survival data and recurrent events (calendar time or gap time) are supported.

Functions

dynfrail() has a friendly syntax, very similar to that of the frailtyEM package: next to the formula and data arguments, the distribution argument specifies the distribution parameters and the control argument controls the precision of the estimation.
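A call might look like the sketch below. It rests on assumptions that should be checked against `?dynfrail`: that the distribution argument is built with a helper named `dynfrail_dist()` taking `dist` and `n_ints` arguments, and that clusters are marked with `cluster()` in the formula as in frailtyEM. The `rats` data set comes from the survival package and is used only for illustration. The code runs only if both packages are installed.

```r
# Skipped silently when dynfrail or survival is not installed.
if (requireNamespace("dynfrail", quietly = TRUE) &&
    requireNamespace("survival", quietly = TRUE)) {
  library(survival)
  library(dynfrail)

  fit <- dynfrail(
    Surv(time, status) ~ rx + cluster(litter),
    data = survival::rats,
    # Assumption: dynfrail_dist() builds the distribution argument;
    # n_ints is assumed to set the number of piecewise constant intervals.
    distribution = dynfrail_dist(dist = "gamma", n_ints = 1)
  )
  print(fit)
}
```

A small number of intervals keeps the run time manageable for a first fit.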

dynfrail_prep() and dynfrail_fit() are used internally by dynfrail() but are also available to the user. The first prepares the input of dynfrail() to make it suitable for the actual EM algorithm. The second runs the EM algorithm once, for fixed (θ, λ), to obtain the maximizing (β, λ_{0}).

Limitations

Estimation is slow even for medium-sized data sets. It is recommended to start with a small number of piecewise constant intervals and/or a subset of the data.
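One way to take a subset of clustered data is to sample whole clusters rather than individual rows, so that the within-cluster structure is preserved. The sketch below uses simulated data in base R; the data frame and its column names (`id`, `time`, `status`) are hypothetical, and sampling by cluster is a suggestion rather than something the package requires.

```r
set.seed(42)

# Simulated clustered data: 50 clusters with 4 rows each (made-up values)
dat <- data.frame(
  id     = rep(1:50, each = 4),
  time   = rexp(200),
  status = rbinom(200, 1, 0.7)
)

# Sample 10 whole clusters and keep all of their rows
keep_ids  <- sample(unique(dat$id), size = 10)
dat_small <- dat[dat$id %in% keep_ids, ]

print(dim(dat_small))
```

Once the model fits in acceptable time on `dat_small`, the number of clusters or the number of intervals can be increased step by step.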