Calculate power for the 'time x treatment' effect
in two- and three-level multilevel longitudinal studies with missing data.
Both the third-level factor (e.g. therapists, schools, or physicians),
and the second-level factor (e.g. subjects), can be assigned random slopes.
Studies with partially nested designs, unequal cluster sizes,
unequal allocation to treatment arms, and different dropout patterns
per treatment are supported. For all designs power can be
calculated both analytically and via simulations. The analytical
calculations extends the method described in Galbraith et al. (2002)
Power Analysis for Longitudinal Multilevel/Linear Mixed-Effects Models.
The purpose of powerlmm
is to help design longitudinal treatment studies (parallel groups), with or without higher-level clustering (e.g. longitudinally clustered by therapists, groups, or physician), and missing data. The main features of the package are:
powerlmm
can be installed from CRAN and GitHub.
install.packages("powerlmm")# GitHub, dev versiondevtools::install_github("rpsychologist/powerlmm")
This is an example of setting up a three-level longitudinal model with random slopes at both the subject- and cluster-level, with different missing data patterns per treatment arm. Relative standardized inputs are used, but unstandardized raw parameters values can also be used.
library(powerlmm)d <- per_treatment(control = dropout_weibull(0.3, 2),treatment = dropout_weibull(0.2, 2))p <- study_parameters(n1 = 11,n2 = 10,n3 = 5,icc_pre_subject = 0.5,icc_pre_cluster = 0,icc_slope = 0.05,var_ratio = 0.02,dropout = d,effect_size = cohend(-0.8,standardizer = "pretest_SD"))p#>#> Study setup (three-level)#>#> n1 = 11#> n2 = 10 x 5 (treatment)#> 10 x 5 (control)#> n3 = 5 (treatment)#> 5 (control)#> 10 (total)#> total_n = 50 (treatment)#> 50 (control)#> 100 (total)#> dropout = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (time)#> 0, 0, 1, 3, 6, 9, 12, 16, 20, 25, 30 (%, control)#> 0, 0, 1, 2, 4, 5, 8, 10, 13, 17, 20 (%, treatment)#> icc_pre_subjects = 0.5#> icc_pre_clusters = 0#> icc_slope = 0.05#> var_ratio = 0.02#> effect_size = -0.8 (Cohen's d [SD: pretest_SD])
plot(p)
get_power(p, df = "satterthwaite")#>#> Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)#> with missing data and unbalanced designs#>#> n1 = 11#> n2 = 10 x 5 (treatment)#> 10 x 5 (control)#> n3 = 5 (treatment)#> 5 (control)#> 10 (total)#> total_n = 50 (control)#> 50 (treatment)#> 100 (total)#> dropout = 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 (time)#> 0, 0, 1, 3, 6, 9, 12, 16, 20, 25, 30 (%, control)#> 0, 0, 1, 2, 4, 5, 8, 10, 13, 17, 20 (%, treatment)#> icc_pre_subjects = 0.5#> icc_pre_clusters = 0#> icc_slope = 0.05#> var_ratio = 0.02#> effect_size = -0.8 (Cohen's d [SD: pretest_SD])#> df = 7.953218#> alpha = 0.05#> power = 68%
Unequal cluster sizes is also supported, the cluster sizes can either be random (sampled), or the marginal distribution can be specified.
p <- study_parameters(n1 = 11,n2 = unequal_clusters(2, 3, 5, 20),icc_pre_subject = 0.5,icc_pre_cluster = 0,icc_slope = 0.05,var_ratio = 0.02,effect_size = cohend(-0.8,standardizer = "pretest_SD"))get_power(p)#>#> Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)#> with missing data and unbalanced designs#>#> n1 = 11#> n2 = 2, 3, 5, 20 (treatment)#> 2, 3, 5, 20 (control)#> n3 = 4 (treatment)#> 4 (control)#> 8 (total)#> total_n = 30 (control)#> 30 (treatment)#> 60 (total)#> dropout = No missing data#> icc_pre_subjects = 0.5#> icc_pre_clusters = 0#> icc_slope = 0.05#> var_ratio = 0.02#> effect_size = -0.8 (Cohen's d [SD: pretest_SD])#> df = 6#> alpha = 0.05#> power = 44%
Cluster sizes follow a Poisson distribution
p <- study_parameters(n1 = 11,n2 = unequal_clusters(func = rpois(5, 5)), # sample from Poissonicc_pre_subject = 0.5,icc_pre_cluster = 0,icc_slope = 0.05,var_ratio = 0.02,effect_size = cohend(-0.8,standardizer = "pretest_SD"))get_power(p, R = 100, progress = FALSE) # expected power by averaging over R realizations#>#> Power Analyis for Longitudinal Linear Mixed-Effects Models (three-level)#> with missing data and unbalanced designs#>#> n1 = 11#> n2 = rpois(5, 5) (treatment)#> - (control)#> n3 = 5 (treatment)#> 5 (control)#> 10 (total)#> total_n = 26 (control)#> 26 (treatment)#> 52 (total)#> dropout = No missing data#> icc_pre_subjects = 0.5#> icc_pre_clusters = 0#> icc_slope = 0.05#> var_ratio = 0.02#> effect_size = -0.8 (Cohen's d [SD: pretest_SD])#> df = 8#> alpha = 0.05#> power = 49% (MCSE: 1%)#>#> NOTE: n2 is randomly sampled. Values are the mean from R = 100 realizations.
Several convenience functions are also included, e.g. for creating power curves.
x <- get_power_table(p,n2 = 5:10,n3 = c(4, 8, 12),effect_size = cohend(c(0.5, 0.8), standardizer = "pretest_SD"))
plot(x)
The package includes a flexible simulation method that makes it easy to investigate the performance of different models. As an example, let's compare the power difference between the 2-level LMM with 11 repeated measures, to doing an ANCOVA at posttest. Using sim_formula
different models can be fit to the same data set during the simulation.
p <- study_parameters(n1 = 11,n2 = 40,icc_pre_subject = 0.5,cor_subject = -0.4,var_ratio = 0.02,effect_size = cohend(-0.8,standardizer = "pretest_SD"))# 2-lvl LMMf0 <- sim_formula("y ~ time + time:treatment + (1 + time | subject)")# ANCOVA, formulas with no random effects are with using lm()f1 <- sim_formula("y ~ treatment + pretest",data_transform = transform_to_posttest,test = "treatment")f <- sim_formula_compare("LMM" = f0,"ANCOVA" = f1)res <- simulate(p,nsim = 2000,formula = f,cores = parallel::detectCores(logical = FALSE))
We then summarize the results using summary
. Let's look specifically at the treatment effects.
summary(res, para = list("LMM" = "time:treatment","ANCOVA" = "treatment"))#> Model: summary#>#> Fixed effects: 'time:treatment', 'treatment'#>#> model M_est theta M_se SD_est Power Power_bw Power_satt#> LMM -1.1 -1.1 0.32 0.31 0.94 0.93 .#> ANCOVA -11.0 0.0 3.70 3.70 0.85 0.85 0.85#> ---#> Number of simulations: 2000 | alpha: 0.05#> Time points (n1): 11#> Subjects per cluster (n2 x n3): 40 (treatment)#> 40 (control)#> Total number of subjects: 80#> ---#> At least one of the models applied a data transformation during simulation,#> summaries that depend on the true parameter values will no longer be correct,#> see 'help(summary.plcp_sim)'
We can also look at a specific model, here's the results for the 2-lvl LMM.
summary(res, model = "LMM")#> Model: LMM#>#> Random effects#>#> parameter M_est theta est_rel_bias prop_zero is_NA#> subject_intercept 100.00 100.0 0.00600 0 0#> subject_slope 2.00 2.0 0.00630 0 0#> error 100.00 100.0 -0.00049 0 0#> cor_subject -0.39 -0.4 -0.01400 0 0#>#> Fixed effects#>#> parameter M_est theta M_se SD_est Power Power_bw Power_satt#> (Intercept) 0.0150 0.0 1.30 1.30 0.046 . .#> time 0.0013 0.0 0.25 0.25 0.055 . .#> time:treatment -1.1000 -1.1 0.32 0.31 0.940 0.93 .#> ---#> Number of simulations: 2000 | alpha: 0.05#> Time points (n1): 11#> Subjects per cluster (n2 x n3): 40 (treatment)#> 40 (control)#> Total number of subjects: 80#> ---#> At least one of the models applied a data transformation during simulation,#> summaries that depend on the true parameter values will no longer be correct,#> see 'help(summary.plcp_sim)'
The package's basic functionality is also implemented in a Shiny web application, aimed at users that are less familiar with R. Launch the application by typing
library(powerlmm)shiny_powerlmm()
This version substantially improves the simulate
method.
sim_formula(..., data_transform = func)
.
As an example transform_to_posttest
is included.sim_formula(..., test = "treatment")
lm()
. If this is combined with the transform_to_posttest
the
longitudinal model can be compared to a cross-sectional model, e.g. ANCOVA.summary.plcp_sim(..., model_selection = "FW", LRT_alpha = 0.25)
.simulate(formula = x)
must now be created using the new functions sim_formula
, or
sim_formula_compare
, and can no longer be a named list or a character vector.summary.plcp_sim()
now show fixed effect theta
s in the correct order, thanks to
GitHub user Johnzav888 (#10).effect_size = 5
,effect_size = cohend(0.5, "posttest_sd")
Simulate.plcp
will now automatically create lme4 formulas if none is
supplied, see ?create_lmer_formula
.uneqal_clusters
now accepts
a function indicating the distribution of cluster sizes, via the new argument
func
, e.g. rpois
or rnorm
could be used to draw cluster sizes.R
.parallel::makeCluster
(PSOCK). Forking is still used for
non-interactive Unix environments.plcp_multi
-objects.cores
.simulate.plcp_multi
now have more options for saving intermediate results.print.plcp_multi_power
now has better support for subsetting via either [],
head(), or subset().icc_pre_subject
is now defined as (u_0^2 + v_0^2) / (u_0^2 + v_0^2 + error^2)
,
instead of (u_0^2) / (u_0^2 + v_0^2 + error^2)
. This would be the subject-level ICC,
if there's no random slopes, i.e. correlation between time points for the same subject.study_parameters()
: 0 and NA now means different things. If 0 is passed, the parameters
is kept in the model, if you want to remove it specify it as NA instead.study_parameters()
: is now less flexible, but more robust. Previously a large
combination if raw and relative parameters could be combined, and the individual
parameters was solved for. To make the function less bug prone and easier to maintain,
it is now only possible to specify the cluster-level variance components as relative values,
if the other parameters as passed as raw inputs.simulate_data()
now includes a column y_c
that contains the full outcome vector,
without missing values added. This makes it easy to compare the complete and incomplete
data set, e.g. via simulate()
.simulate()
new argument batch_progress
enables showing progress when doing
multiple simulations.summary.plcp_sim
where the wrong % convergence was calculated.study_parameters
when
icc_cluster_pre = NULL
and all inputs are standardized.var_ratio
argument was
passed a vector of values including a 0, e.g. var_ratio = c(0, 0.1, 0.2)
.res[[1]]$paras
, and thus the single simulation would not print
correctly.cluster_intercept
and cluster_slope
is now correctly extracted from
(0 + treatment + treatment:time || cluster)
.Power.plcp_multi
is now exported.get_power.plcp_multi
now shows a progress bar.deterministic_dropout = FALSE
.First release.