Machine Learning Models and Tools

Meta-package for statistical and machine learning with a common interface for model fitting, prediction, performance assessment, and presentation of results. Supports predictive modeling of numerical, categorical, and censored time-to-event outcomes and resample (bootstrap and cross-validation) estimation of model performance.


MachineShop: Machine Learning Models and Tools

Overview

MachineShop is a meta-package for statistical and machine learning with a common interface for model fitting, prediction, performance assessment, and presentation of results. Support is provided for predictive modeling of numerical, categorical, and censored time-to-event outcomes, including those listed in the table below, and for resample (bootstrap and cross-validation) estimation of model performance.

                                  Response Variable Types
Model                             factor  numeric  ordered  Surv
C5.0 Classification                 x
Conditional Inference Trees         x        x               x
Cox Regression                                               x
Generalized Linear Models           x        x
Gradient Boosted Models             x        x               x
Lasso and Elastic-Net               x        x               x
Feed-Forward Neural Networks        x        x
Partial Least Squares               x        x
Ordered Logistic Regression                          x
Random Forests                      x        x
Parametric Survival Regression                               x
Support Vector Machines             x        x
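
The column headings in the table are R classes of the response variable: factor, numeric, and ordered come from base R, and Surv comes from the survival package. As a minimal sketch of what each kind of response looks like (the values are invented for illustration, and how MachineShop handles each class internally is not shown here):

```r
# Sketch: the four response classes named in the table header,
# built with base R and the recommended 'survival' package.
# All values below are made-up illustration data.
library(survival)

y_factor  <- factor(c("setosa", "versicolor", "setosa"))
y_numeric <- c(5.1, 4.9, 6.3)
y_ordered <- ordered(c("low", "high", "medium"),
                     levels = c("low", "medium", "high"))
# Surv(time, event): follow-up times with 1 = event observed, 0 = censored
y_surv    <- Surv(time = c(5, 12, 8), event = c(1, 0, 1))

# Report the leading class of each response
sapply(list(factor = y_factor, numeric = y_numeric,
            ordered = y_ordered, Surv = y_surv),
       function(y) class(y)[1])
```

The class of the variable on the left-hand side of the model formula is what determines which rows of the table apply.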

Installation

# install.packages("devtools")
devtools::install_github("brian-j-smith/MachineShop")
 
# Development version with vignettes
devtools::install_github("brian-j-smith/MachineShop", build_vignettes = TRUE)
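
Before running the commands above, it can help to check what is already installed. The following is a small sketch using only base R; `needs_install` is a hypothetical helper written for this example and is not part of MachineShop or devtools.

```r
# Hypothetical helper (not part of MachineShop or devtools):
# TRUE when a package cannot be found in the current library paths.
needs_install <- function(pkg) {
  !requireNamespace(pkg, quietly = TRUE)
}

# Report which installation step, if any, is still outstanding
if (needs_install("devtools")) {
  message("devtools is missing; run install.packages(\"devtools\") first")
} else if (needs_install("MachineShop")) {
  message("devtools is available; MachineShop can be installed from GitHub")
} else {
  message("MachineShop ",
          as.character(utils::packageVersion("MachineShop")),
          " is already installed")
}
```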

Example

The following is a brief example using the package to apply gradient boosted models to predict the species of flowers in Edgar Anderson's iris dataset.

## Load the package
library(MachineShop)
library(magrittr)
 
## Iris flower species (3 level response)
df <- iris
df$Species <- factor(df$Species)
 
## Create training and test sets
set.seed(123)
trainindices <- sample(nrow(df), nrow(df) * 2 / 3)
train <- df[trainindices, ]
test <- df[-trainindices, ]
 
## Gradient boosted model fit to training set
gbmfit <- fit(Species ~ ., data = train, model = GBMModel)
 
## Variable importance
(vi <- varimp(gbmfit))
#>                  Overall
#> Petal.Length 100.0000000
#> Petal.Width   12.9638575
#> Sepal.Width    0.1409401
#> Sepal.Length   0.0000000
 
plot(vi)
 
## Test set predicted probabilities
predict(gbmfit, newdata = test, type = "prob") %>% head
#>         setosa   versicolor    virginica
#> [1,] 0.9999755 2.449128e-05 2.828117e-08
#> [2,] 0.9999365 6.346918e-05 6.535304e-09
#> [3,] 0.9999365 6.346918e-05 6.535304e-09
#> [4,] 0.9999755 2.449128e-05 2.828117e-08
#> [5,] 0.9998941 1.059313e-04 8.577135e-09
#> [6,] 0.9999291 7.084465e-05 5.736212e-09
 
## Test set predicted classification
predict(gbmfit, newdata = test) %>% head
#> [1] setosa setosa setosa setosa setosa setosa
#> Levels: setosa versicolor virginica
 
## Resample estimation of model performance
(perf <- resample(Species ~ ., data = df, model = GBMModel))
#> An object of class "Resamples"
#> 
#> metrics: Accuracy, Kappa, MLogLoss
#> 
#> method: 10-Fold CV
#> 
#> resamples: 10
 
summary(perf)
#>               Mean    Median         SD          Min       Max NA
#> Accuracy 0.9400000 0.9333333 0.04919099 0.8666666667 1.0000000  0
#> Kappa    0.9100000 0.9000000 0.07378648 0.8000000000 1.0000000  0
#> MLogLoss 0.2624184 0.2301317 0.23719156 0.0008299047 0.5474997  0
 
plot(perf)
 
## Model tuning
gbmtune <- tune(Species ~ ., data = df, model = GBMModel,
                grid = expand.grid(n.trees = c(25, 50, 100),
                                   interaction.depth = 1:3,
                                   n.minobsinnode = c(5, 10)))
 
plot(gbmtune, type = "line")
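
To make the resample output above more concrete, the following sketch carries out 10-fold cross-validation by hand and summarizes accuracy and Cohen's kappa across folds. It is a generic illustration, not MachineShop's internal implementation: `MASS::lda` stands in for the gradient boosted model so that the code runs with R's recommended packages only, and `kappa_stat` is a helper defined here for the example.

```r
# Sketch of 10-fold cross-validation by hand (not MachineShop internals).
# MASS::lda is a stand-in classifier chosen only to keep the example
# self-contained.
library(MASS)

set.seed(123)
k <- 10
# Randomly assign each row of iris to one of k equally sized folds
folds <- sample(rep(seq_len(k), length.out = nrow(iris)))

# Cohen's kappa: observed agreement corrected for chance agreement
kappa_stat <- function(observed, predicted) {
  tab <- table(observed, predicted)
  n <- sum(tab)
  p_obs <- sum(diag(tab)) / n
  p_exp <- sum(rowSums(tab) * colSums(tab)) / n^2
  (p_obs - p_exp) / (1 - p_exp)
}

# For each fold: fit on the other k - 1 folds, predict the held-out fold
cv <- t(sapply(seq_len(k), function(i) {
  train <- iris[folds != i, ]
  test  <- iris[folds == i, ]
  fit   <- lda(Species ~ ., data = train)
  pred  <- predict(fit, newdata = test)$class
  c(Accuracy = mean(pred == test$Species),
    Kappa    = kappa_stat(test$Species, pred))
}))

# Mean and standard deviation of each metric across the k folds
rbind(Mean = colMeans(cv), SD = apply(cv, 2, sd))
```

Each row of `cv` corresponds to one held-out fold, which mirrors the per-resample metrics that `summary(perf)` condenses into a mean, median, and spread.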

Documentation

Once the package is installed, general documentation on its usage can be viewed with the following console commands.

library(MachineShop)
 
# Package help summary
?MachineShop
 
# Vignette
RShowDoc("Introduction", package = "MachineShop")

News

Version Updates

0.1

  • Initial public release
