Regression Helper Functions

Methods for manipulating regression models and for describing these in a style adapted for medical journals. Contains functions for generating an HTML table with crude and adjusted estimates, plotting hazard ratios, plotting model estimates and confidence intervals using forest plots, and extending this to comparing multiple models in a single forest plot. In addition to the descriptive methods, there are add-ons for the robust covariance matrix provided by the 'sandwich' package, a function for adding non-linearities to a model, and a wrapper around the 'Epi' package's Lexis() functions for time-splitting a dataset when modeling non-proportional hazards in Cox regressions.



This package helps with building and conveying regression models. It also has a few functions for handling robust confidence intervals for the ols() regression in the rms package. It is closely interconnected with the Gmisc, htmlTable, and forestplot packages.

Communicating statistical results is in my opinion just as important as performing the statistics. Effect sizes may seem technical to clinicians, but putting them into context often helps you get your point across.

Crude and adjusted estimates

The function that I use most frequently in this package is printCrudeAndAdjustedModel. It generates a table with the full model coefficients and confidence intervals alongside a crude version of each estimate, taken from a model containing only that single variable. This lets the reader quickly see how strong each variable is and how it behaves within the full model, e.g. a variable whose estimate changes markedly when the other variables are added suggests substantial confounding. See the vignette for more details, vignette("Print_crude_and_adjusted_models").
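The idea behind a crude-versus-adjusted table can be sketched in base R. This is only an illustration of the concept, not the package's implementation: the built-in mtcars data, the chosen variables, and all object names are assumptions made for the example.

```r
# Illustrative only: mtcars stands in for a clinical dataset
vars <- c("wt", "hp", "qsec")

# Adjusted estimates: all variables in a single model
adjusted_fit <- lm(reformulate(vars, response = "mpg"), data = mtcars)
adjusted <- cbind(
  adj_estimate = coef(adjusted_fit)[vars],
  adj_lower    = confint(adjusted_fit)[vars, 1],
  adj_upper    = confint(adjusted_fit)[vars, 2]
)

# Crude estimates: one single-variable model per variable
crude <- t(sapply(vars, function(v) {
  fit <- lm(reformulate(v, response = "mpg"), data = mtcars)
  ci <- confint(fit)
  c(crude_estimate = unname(coef(fit)[v]),
    crude_lower    = unname(ci[v, 1]),
    crude_upper    = unname(ci[v, 2]))
}))

# Side-by-side table: a large shift between the crude and the
# adjusted estimate hints at confounding by the other variables
comparison <- cbind(crude, adjusted)
print(round(comparison, 3))
```

printCrudeAndAdjustedModel takes the fitted full model and produces a formatted table of this kind; the vignette covers its actual arguments.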

Forest plots for regression models

I also like to use forest plots for conveying regression models. A common alternative to tables is to use a forest plot with estimates and confidence intervals displayed in a graphical manner. The actual numbers of the model may be better suited for text while the graphs quickly tell how different estimates relate.

Sometimes we also need to choose between two models, e.g. a Poisson regression and a Cox regression. This package provides a forestplotCombineRegrObj function that lets you show two models simultaneously and see how they behave in different settings. This is also useful when performing sensitivity analyses and comparing different selection criteria, e.g. selecting only the patients with high-quality data and seeing how that compares.
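As a rough sketch of the kind of comparison such a combined forest plot displays, the same term can be extracted from a Cox model and a Poisson model and tabulated. The lung dataset from the survival package, the helper ratio_with_ci, and the use of Wald intervals are assumptions for this example and are not taken from the package.

```r
library(survival)  # provides coxph(), Surv() and the lung dataset

# In lung, status 2 = death and 1 = censored; recode to 1/0
d <- transform(lung, event = as.integer(status == 2))

cox_fit  <- coxph(Surv(time, event) ~ age + sex, data = d)
pois_fit <- glm(event ~ age + sex + offset(log(time)),
                family = poisson, data = d)

# Hypothetical helper: exponentiated estimate with a Wald interval
ratio_with_ci <- function(fit, term) {
  ci <- confint.default(fit)[term, ]
  c(estimate = unname(exp(coef(fit)[term])),
    lower    = unname(exp(ci[1])),
    upper    = unname(exp(ci[2])))
}

# One row per model, ready to be drawn as a forest plot
models <- list(Cox = cox_fit, Poisson = pois_fit)
comparison <- t(sapply(models, ratio_with_ci, term = "sex"))
print(round(comparison, 2))
```

Each row of the table corresponds to one line in a forest plot, which makes it easy to see whether the two modeling choices agree.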

Plotting non-linear hazard ratios

The plotHR function was my first attempt at something more advanced, based upon Reinhard Seifert's original adaptation of the stats::termplot function. It has some neat functionality, although I must admit that I now often use ggplot2 for many of my plots as I like to have a consistent look throughout. The function does, though, have a neat way of displaying the density of the variable at the bottom.
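For orientation, a non-linear hazard ratio curve can be approximated with stats::termplot directly. This is a minimal sketch and not how plotHR is implemented: the lung data, the penalized spline for age, the median age as reference, and the rug as a stand-in for the density display are all assumptions for the example.

```r
library(survival)

# Penalized spline for age; lung ships with the survival package
fit <- coxph(Surv(time, status) ~ pspline(age) + sex, data = lung)

# plot = FALSE returns the plotting data instead of drawing it
tp <- termplot(fit, se = TRUE, plot = FALSE)
age_curve <- tp$age

# Express the curve as a hazard ratio relative to the median age
ref_row <- which.min(abs(age_curve$x - median(lung$age)))
log_hr  <- age_curve$y - age_curve$y[ref_row]
hr      <- exp(log_hr)

# Approximate pointwise band: the standard errors belong to the
# term itself, not to the contrast against the reference age
lower <- exp(log_hr - 1.96 * age_curve$se)
upper <- exp(log_hr + 1.96 * age_curve$se)

plot(age_curve$x, hr, type = "l", log = "y",
     ylim = range(lower, upper),
     xlab = "Age (years)", ylab = "Hazard ratio")
lines(age_curve$x, lower, lty = 2)
lines(age_curve$x, upper, lty = 2)
abline(h = 1, lty = 3)
rug(lung$age)  # marks where the observations are concentrated
```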

Modeling helpers

Much of our modeling ends up a little repetitive, and this package contains a set of functions that I've found useful. My approach to modeling regressions is heavily influenced by Frank Harrell's regression modeling strategies. The core idea consists of the following steps:

  • Choose the variables that should be in the model (for this I often use DAG diagrams drawn with dagitty.net)
  • I build the basic model and then test the continuous variables for non-linearity using the addNonlinearity function. The function uses ANOVA to test for non-linearity and, if it finds any, fits a spline function over a range of knots, e.g. 2-7 knots, and then picks the model with the lowest AIC/BIC value. If it is the main variable I do this by hand to avoid choosing an overly complex model when the AIC/BIC values are very similar, but for confounders I've found this a handy approach.
  • I then check for interactions that I believe to exist using the ANOVA approach
  • Finally I check for violations of the model assumptions. I often use linear regression for Health Related Quality of Life (HRQoL) scores, and these are often plagued by heteroscedasticity; using a robust variance-covariance matrix through the robcov_alt method takes care of this issue. In survival analyses the proportional hazards assumption is sometimes violated, and here the timeSplitter function helps you set up a dataset that allows you to build time-interaction models (see vignette("timeSplitter") for details).
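The non-linearity step in the list above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the addNonlinearity implementation: the data are simulated, splines::ns with 2 to 7 degrees of freedom stands in for the range of knots, and AIC is used as the selection criterion.

```r
library(splines)  # natural cubic splines via ns()

# Simulated data with a clearly non-linear relationship
set.seed(1)
n <- 300
x <- runif(n, 0, 10)
y <- 3 * sin(x / 2) + rnorm(n)

# Baseline model with a linear term only
linear_fit <- lm(y ~ x)

# Candidate spline models of increasing flexibility
dfs <- 2:7
spline_fits <- lapply(dfs, function(k) lm(y ~ ns(x, df = k)))
aics <- setNames(sapply(spline_fits, AIC), dfs)

# Keep the candidate with the lowest AIC
best_fit <- spline_fits[[which.min(aics)]]

# F-test of the linear model against the chosen spline model
nonlin_test <- anova(linear_fit, best_fit)
p_value <- nonlin_test[2, "Pr(>F)"]

print(round(aics, 1))
print(p_value)
```

When the AIC values of neighbouring candidates are nearly identical, choosing the simpler one by hand avoids the overfitting concern mentioned above.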

News

NEWS for the Greg package

Changes for 1.3

  • Fix vignette bug
  • Fixed CRAN package names
  • Fix bug due to change in survival::coxph

Changes for 1.2

  • Fixed spelling errors, travis package check, added tests for coverage and other things related to CRAN submission
  • Added a timeSplitter for making time splits through the Epi::Lexis object simpler
  • Added a check for null as input parameter to printCrudeAndAdjusted
  • Added a subset function for get/printCrudeAndAdjusted
  • Updated the test cases
  • The reference now has the same number of digits as the rest of the coefficients
  • Added htmlTable.printCrudeAndAdjusted for more flexible handling

Changes for 1.1

  • The printCrudeAndAdjusted now calls the correct print version
  • Added vignette for basic use cases
  • Fixed non-linearity
  • Added options-alternative to tailor printCrudeAdj output
  • Fixed issue #5
  • Fixed tspanner handling for printCrudeAndAdjustedModel
  • Fixed bug with descriptive stats due to a misspelled argument (useNa instead of useNA)
  • Added a rbind for printCrudeAndAdjusted
  • Added handling of missing variable for the outcome estimator
  • Fixed simpleRmsAnova
  • Fixed getModelData with subsetting arguments not being constants
  • All show_missing are now useNA to comply with main package standards
  • A few bug fixes

Changes for 1.0.0

  • Changed the desc_ arguments into a list using the caDescribeOpts
  • Refactored major parts of the printCrudeAndAdjusted
  • Changed to use model.frame() under the hood - hopefully stabilizing the functions as it uses a more standard R-approach
  • The update() now actively runs under the model environment
  • The plotHR is updated and can now plot contrasts if provided a rms-package regression allowing for efficient comparison of multiple models. Internal functions have also been externalized.
  • All current unit tests now pass and several new ones have been added
  • Implemented DRY roxygen2 code
  • The getCrudeAndAdjusted now retains cluster and strata - there are options for leaving these out to retain old behavior
  • The getCrudeAndAdjusted now allows for selecting variables. The printCrudeAndAdjusted now passes on the order option when variable selection is of interest.
  • Cleaned variables for the forestplot2 functions
  • Added the addNonlinearity that adds non-linearity for a variable if it finds support for it in the data.

Changes for 0.7.1

  • Internalized some of the private function documentation
  • Improved the outcome extractor function and added test cases

Changes for 0.7.0

  • Major remake of the print-/getCrudeAndAdjusted so that they depend on prMapVariable2Name and everything is now centered around the variables
  • The printCrudeAndAdjustedModel no longer capitalizes the first letter in order to allow all lower case var names
  • Added imputation compatibility with the Hmisc::fit.mult.impute function
  • Added multiple test cases for stability

Changes for 0.6.1

  • Changed versioning
  • Fixed bug for printCrudeAndAdjusted when using matrix from the getC&A
  • Fixed bug when boolean desc_column was generated
  • Added stop() when using descriptive column without add_references

Changes for 0.6.0.1

  • The getCrudeAndAdjusted now handles the intercept term better for naming the rows (Thanks to Victor)
  • The getCrudeAndAdjusted/printCrudeAndAdjusted now use 'model' instead of 'fit' for the model regression object name
  • The printCrudeAndAdjusted now calls print on the htmlTable_str object so that it appears as expected
  • Added unit tests

Changes for 0.6.0.0

  • The split from the Gmisc package


install.packages("Greg")

1.3.2 by Max Gordon, 12 hours ago


http://gforge.se


Browse source code at https://github.com/cran/Greg


Authors: Max Gordon [aut, cre] , Reinhard Seifert [aut] (Author of original plotHR)




GPL (>= 3) license


Imports Hmisc, stringr, rms, sandwich, stats, nlme, methods, htmlTable, magrittr, knitr, Epi, utils, graphics, grDevices

Depends on forestplot, Gmisc

Suggests boot, testthat, cmprsk, survival, dplyr, ggplot2, parallel, rmarkdown

