Add Formula Interfaces to Modelling Functions

Automatically generates wrappers for modelling functions that accept data as a data matrix X and a data vector y, so that users can instead specify input data with a formula and a data frame. In addition to generating formula interfaces, users may also generate wrapper S3 generics.


formulize


If you:

  • like using formulas, recipes and data frames to specify design matrices
  • develop nervous tics when you come across modelling packages that only offer matrix/vector interfaces
  • don't have the time or motivation to write a formula wrapper around these interfaces
  • like untested and hacky software written by amateurs

then formulize may be for you. Formulize is very new, but you can still install it from GitHub with:

devtools::install_github("alexpghayes/formulize")

Adding a formula or recipe interface

Suppose you want to add a formula interface to an existing modelling function, say cv.glmnet. Then you could do the following:

library(recipes)
library(glmnet)
library(formulize)
 
glmnet_cv <- formulize(cv.glmnet)
 
glmnet_model <- glmnet_cv(mpg ~ drat + hp - 1, mtcars)
predict(glmnet_model, head(mtcars))
#>                          1
#> Mazda RX4         22.35385
#> Mazda RX4 Wag     22.35385
#> Datsun 710        22.85056
#> Hornet 4 Drive    19.97909
#> Hornet Sportabout 17.72895
#> Valiant           19.24104

Similarly, glmnet_cv works with recipe objects:

rec <- recipe(mpg ~ drat + hp, data = mtcars)
 
glmnet_model2 <- glmnet_cv(rec, mtcars)
predict(glmnet_model2, head(mtcars))
#>             1
#> [1,] 22.35392
#> [2,] 22.35392
#> [3,] 22.85062
#> [4,] 19.97897
#> [5,] 17.72884
#> [6,] 19.24084
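
Conceptually, a formula wrapper has to turn a formula and a data frame into a design matrix and a response, call the underlying function, and rebuild the design matrix at prediction time. The following is a minimal base-R sketch of that idea, not formulize's actual implementation. The helper names fit_with_formula and predict_with_formula are hypothetical, and stats::lm.fit stands in for any function with an X/y interface that returns a list.

```r
# Hypothetical helper, not part of formulize: wrap any function with an
# (X, y) interface so that it accepts a formula and a data frame instead.
fit_with_formula <- function(formula, data, fit_fun = stats::lm.fit, ...) {
  frame <- stats::model.frame(formula, data)   # evaluate the formula's variables
  X <- stats::model.matrix(formula, frame)     # design matrix
  y <- stats::model.response(frame)            # response vector
  fit <- fit_fun(X, y, ...)                    # assumed to return a list
  fit$formula <- formula                       # remember how X was built
  fit
}

# Hypothetical prediction helper for linear fits with a `coefficients` element.
predict_with_formula <- function(fit, newdata) {
  rhs <- stats::delete.response(stats::terms(fit$formula))  # drop the response
  X_new <- stats::model.matrix(rhs, newdata)                # same columns as X
  drop(X_new %*% fit$coefficients)
}

fit <- fit_with_formula(mpg ~ drat + hp - 1, mtcars)
preds <- predict_with_formula(fit, head(mtcars))
```

Storing the formula on the fitted object is one simple way to make prediction on new data possible. A real wrapper may keep different or additional information.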

You may also be interested in the more dangerous (that is, exciting) version, genericize, which you should call for its side effects.

genericize(cv.glmnet)
 
form <- mpg ~ drat + hp - 1
X <- model.matrix(form, mtcars)
y <- mtcars$mpg
 
set.seed(27)
mat_model <- cv.glmnet(X, y, intercept = TRUE)
 
set.seed(27)
frm_model <- cv.glmnet(form, mtcars, intercept = TRUE)
 
set.seed(27)
rec_model <- cv.glmnet(rec, mtcars, intercept = TRUE)
 
predict(mat_model, head(X))
#>                          1
#> Mazda RX4         22.25028
#> Mazda RX4 Wag     22.25028
#> Datsun 710        22.73249
#> Hornet 4 Drive    20.01959
#> Hornet Sportabout 17.84620
#> Valiant           19.33092
predict(frm_model, head(mtcars))
#>                          1
#> Mazda RX4         22.25028
#> Mazda RX4 Wag     22.25028
#> Datsun 710        22.73249
#> Hornet 4 Drive    20.01959
#> Hornet Sportabout 17.84620
#> Valiant           19.33092
predict(rec_model, head(mtcars))
#>             1
#> [1,] 22.25035
#> [2,] 22.25035
#> [3,] 22.73255
#> [4,] 20.01946
#> [5,] 17.84608
#> [6,] 19.33070

This creates a new S3 generic cv.glmnet, sets the provided function as the default method (cv.glmnet.default), and adds methods cv.glmnet.formula and cv.glmnet.recipe using formulize.

This will mask cv.glmnet and features no safety checks because safety isn't fun.
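
To make the masking concrete, here is a rough hand-written equivalent of the steps described above, using only base R. It is a sketch under assumptions rather than what genericize does internally: fit_ls is a hypothetical matrix/vector modelling function standing in for cv.glmnet, and the recipe method is omitted.

```r
# Hypothetical matrix/vector modelling function (stand-in for cv.glmnet).
fit_ls <- function(x, y, ...) stats::lm.fit(x, y, ...)

# Keep the original as the default method, then mask the name with a generic.
fit_ls.default <- fit_ls
fit_ls <- function(x, ...) UseMethod("fit_ls")

# Formula method: build X and y, then hand off to the default method.
fit_ls.formula <- function(x, data, ...) {
  frame <- stats::model.frame(x, data)
  fit_ls.default(stats::model.matrix(x, frame), stats::model.response(frame), ...)
}

form <- mpg ~ drat + hp - 1
X <- model.matrix(form, mtcars)

mat_fit <- fit_ls(X, mtcars$mpg)   # a matrix falls through to fit_ls.default
frm_fit <- fit_ls(form, mtcars)    # a formula dispatches to fit_ls.formula
```

After the second assignment, the name fit_ls refers to the generic rather than the original function, which is the same kind of masking that genericize applies to cv.glmnet.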

Caveats

  • formulize doesn't do anything special with intercepts. This means that you need to be careful with functions that require you to specify intercepts in non-standard ways, such as cv.glmnet above.
  • If the original modelling function doesn't return a list, formulize will probably break.
  • If you're just looking for a formula interface to glmnet, take a look at glmnetUtils.
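
The intercept caveat is easiest to see by inspecting the design matrices directly. The snippet below uses only base R and assumes the wrapper builds its matrix in the same way as model.matrix. With the default formula, the matrix already contains a constant column, so a function like cv.glmnet that adds its own intercept (via its intercept argument) would receive both. This is why the examples above use `- 1`.

```r
form_no_int <- mpg ~ drat + hp - 1   # intercept column suppressed
form_int    <- mpg ~ drat + hp       # default: intercept column included

cols_no_int <- colnames(model.matrix(form_no_int, mtcars))
cols_int    <- colnames(model.matrix(form_int, mtcars))

cols_no_int
#> [1] "drat" "hp"
cols_int
#> [1] "(Intercept)" "drat"        "hp"
```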

News

formulize 0.1.0

  • Added formulize and genericize functions
  • Added a NEWS.md file to track changes to the package.


install.packages("formulize")

0.1.0 by Alex Hayes, a year ago


Browse source code at https://github.com/cran/formulize


Authors: Alex Hayes [aut, cre]


Documentation:   PDF Manual  


MIT + file LICENSE license


Imports rlang, recipes

Suggests covr, glmnet, testthat

