Machine Learning with AdaBoost on Decision Stumps

Creates a classifier for binary outcomes using the Adaptive Boosting (AdaBoost) algorithm on decision stumps, with a fast C++ implementation. For a description of AdaBoost, see Freund and Schapire (1997). This type of classifier is nonlinear, but easy to interpret and visualize. Feature vectors may be a combination of continuous (numeric) and categorical (string, factor) elements. Methods for classifier assessment, prediction, and cross-validation are also included.


sboost is a machine learning package for building and testing classifiers using AdaBoost on decision stumps.

The advantage of this type of classifier is that it is nonlinear yet more interpretable than random forests, neural networks, and other nonlinear classifiers.

See jadonwagstaff.github.io/sboost for a description of how the classifier works and what makes it more interpretable than others.

For the original paper describing AdaBoost, see:

Freund, Y., Schapire, R.E.: A decision-theoretic generalization of on-line learning and an application to boosting. Journal of Computer and System Sciences 55(1), 119-139 (1997)
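
To make the idea of boosting on decision stumps concrete, here is a minimal base-R sketch. It is not the sboost implementation (which is written in C++ and also handles categorical features). It assumes numeric features in a matrix and outcomes coded as -1 and +1, and all function names (fit_stump, predict_stump, adaboost, predict_adaboost) are hypothetical.

```r
# Illustrative AdaBoost on decision stumps; NOT the sboost implementation.
# Assumes x is a numeric matrix and y is a vector of -1 / +1 outcomes.

# Search every feature, split point, and direction for the stump with the
# lowest weighted classification error.
fit_stump <- function(x, y, w) {
  best <- list(error = Inf)
  for (j in seq_len(ncol(x))) {
    for (split in unique(x[, j])) {
      for (direction in c(1, -1)) {
        pred <- ifelse(x[, j] <= split, -direction, direction)
        err <- sum(w[pred != y])
        if (err < best$error) {
          best <- list(feature = j, split = split,
                       direction = direction, error = err)
        }
      }
    }
  }
  best
}

# A stump predicts one outcome at or below the split and the other above it.
predict_stump <- function(stump, x) {
  ifelse(x[, stump$feature] <= stump$split, -stump$direction, stump$direction)
}

adaboost <- function(x, y, iterations = 5) {
  n <- nrow(x)
  w <- rep(1 / n, n)                      # start with uniform weights
  stumps <- vector("list", iterations)
  for (m in seq_len(iterations)) {
    stump <- fit_stump(x, y, w)
    # Clamp the error so the vote stays finite for a perfect stump.
    err <- min(max(stump$error, 1e-10), 1 - 1e-10)
    stump$vote <- 0.5 * log((1 - err) / err)
    # Increase the weight of misclassified rows, then renormalize.
    w <- w * exp(-stump$vote * y * predict_stump(stump, x))
    w <- w / sum(w)
    stumps[[m]] <- stump
  }
  stumps
}

# The final prediction is the sign of the vote-weighted sum over all stumps.
predict_adaboost <- function(stumps, x) {
  score <- rep(0, nrow(x))
  for (stump in stumps) {
    score <- score + stump$vote * predict_stump(stump, x)
  }
  ifelse(score >= 0, 1, -1)
}

# Toy data that no single stump can classify perfectly.
x <- matrix(c(1, 2, 3, 4, 5, 6, 7, 8), ncol = 1)
y <- c(1, 1, -1, -1, -1, 1, 1, 1)
model <- adaboost(x, y, iterations = 3)
predict_adaboost(model, x)
```

Each stump is a single split on one feature plus a vote, so the fitted model can be read as a short list of weighted rules, which is the source of the interpretability mentioned above.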

Installation

Install this package from the CRAN repository.

install.packages("sboost")

Alternatively, use devtools to install the development version of this package.

To install devtools in R, run:

install.packages("devtools")

Once devtools is installed, install the sboost package by running:

devtools::install_github("jadonwagstaff/sboost")

Functions

sboost - Main machine learning algorithm; uses categorical or continuous features to build a classifier that predicts a binary outcome. Run ?sboost::sboost to see the documentation in R.

validate - Uses k-fold cross-validation on a training set to validate the classifier.

assess - Shows performance of a classifier on a set of feature vectors and outcomes.

predict - Outputs predictions of a classifier on a set of feature vectors.
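
The four functions might be combined as in the sketch below. It assumes that sboost is installed, that the package ships a mushrooms data set whose first column is the outcome (with "p" as one of its values), and that the argument names iterations, positive, and k_fold match the installed version. These details are my reading of the package documentation, so check ?sboost::sboost and ?sboost::validate before relying on them.

```r
# Sketch of a typical workflow; the data set and argument names are
# assumptions based on the package documentation, not guaranteed here.
if (requireNamespace("sboost", quietly = TRUE)) {
  library(sboost)

  features <- mushrooms[-1]   # assumed: every column except the outcome
  outcomes <- mushrooms[1]    # assumed: first column holds the outcome

  # Build a classifier from five decision stumps.
  classifier <- sboost(features, outcomes, iterations = 5, positive = "p")
  print(classifier)

  # Estimate performance with 3-fold cross-validation.
  validation <- validate(features, outcomes, iterations = 5,
                         k_fold = 3, positive = "p")
  print(validation)

  # Measure performance of the fitted classifier on a labeled set.
  print(assess(classifier, features, outcomes))

  # Predict outcomes for a set of feature vectors.
  predictions <- predict(classifier, features)
  print(head(predictions))
}
```

Assessing on the training data, as done here for brevity, overstates performance; the cross-validated result from validate is the more honest estimate.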

Author

Jadon Wagstaff

Licence

MIT

News

sboost 0.1.1

Major Changes

  • Classifier output from sboost() now includes a right_categories column, which is similar to left_categories but is associated with the outcomes in the right column. When assessing this new classifier, if a categorical input cannot be found in either right_categories or left_categories (i.e. it was not found in the training data), the vote for this feature will now be 0. (Previously, if an input was not found in left_categories, it was assumed to be associated with the right outcome.)

  • There is a new optional parameter in the sboost() and validate() functions called verbose. The default value for verbose is FALSE, and there is no change from previous versions when verbose = FALSE. If verbose is set to TRUE, a progress bar will appear in the console for each classifier that is created.
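
The voting rule for unseen categories described in the first change above can be illustrated with a small base-R sketch. The stump representation and the stump_vote function are hypothetical and do not reflect the internal structure sboost uses. They only show the rule that a category absent from both left_categories and right_categories contributes a vote of 0.

```r
# Hypothetical representation of one categorical stump (not sboost internals).
stump <- list(
  vote = 0.8,
  left = "negative",  left_categories  = c("red", "green"),
  right = "positive", right_categories = c("blue")
)

# Signed contribution of a stump: +vote for the positive outcome,
# -vote for the negative outcome, 0 for a category unseen in training.
stump_vote <- function(stump, value, positive = "positive") {
  if (value %in% stump$left_categories) {
    outcome <- stump$left
  } else if (value %in% stump$right_categories) {
    outcome <- stump$right
  } else {
    return(0)  # unseen category: this stump abstains
  }
  if (outcome == positive) stump$vote else -stump$vote
}

stump_vote(stump, "blue")    # 0.8
stump_vote(stump, "red")     # -0.8
stump_vote(stump, "purple")  # 0
```

Under the previous behavior, "purple" would have been treated like "blue" and contributed the full vote toward the right outcome.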

Minor Changes

  • The Description of the package in the DESCRIPTION file now contains a reference to Freund and Schapire's paper on AdaBoost.


0.1.1 by Jadon Wagstaff, 3 months ago


https://github.com/jadonwagstaff/sboost


Report a bug at https://github.com/jadonwagstaff/sboost/issues


Browse source code at https://github.com/cran/sboost


Authors: Jadon Wagstaff [aut, cre]




MIT + file LICENSE license


Imports dplyr, rlang, Rcpp, stats

Suggests testthat

Linking to Rcpp

