Deep Forest Model

R application programming interface (API) for Deep Forest which based on Zhou and Feng (2017). Deep Forest: Towards an Alternative to Deep Neural Networks. () or Zhou and Feng (2017). Deep Forest. (). And for the Python module 'gcForest' (< https://github.com/pylablanche/gcForest>).


Xu Jing

1.Introduction

The gcForest algorithm was suggested in Zhou and Feng 2017 (https://arxiv.org/abs/1702.08835 , refer for this paper for technical details) and pylablanche(https://github.com/pylablanche) who provide a Python3.X implementation of this algorithm on github (https://github.com/pylablanche/gcForest). We provide a R package called gcForest which is the R interface of the pylablanche's gcForest module (Python3.X). And if you want to known more about gcForest, please read the source paper (Deep Forest).

2.Prerequisites

As mentioned in the previous paragraph, we provide a R interface of the Python3.X module gcForest(by pylablanche), If you want to use this R packages like the R's tensorflow and keras you should hava a Python3.X environment first, and then, you will need to have the following installed on your computer to make it work:

  • Python 3.X
  • Numpy >= 1.12.0
  • Scikit-learn >= 0.18.1

after these, you can install this R package like:

  • install.packages('gcForest')

  • devtools::install_github('DataXujing/gcForest_r')

run library(gcForest) in your R Console no mistakes.

3.Using gcForest

Supported APIs:

  • fit(X,y) Training the gcForest on input data X and associated target y;
  • predict(X) Predict the class of unknown samples X;
  • predict_proba(X) Predict the class probabilities of unknown samples X;
  • mg_scanning(X, y=None) Performs a Multi Grain Scanning on input data;
  • window_slicing_pred_prob(X, window, shape_1X, y=None) Performs a window slicing of the input data and send them through Random Forests. If target values 'y' are provided sliced data are then used to train the Random Forests;
  • cascade_forest(X, y=None) Perform (or train if 'y' is not None) a cascade forest estimator;
  • gcdata(x) Tansform R data structure to Python data structure;
  • model_save(model,path) To save trained model in disk;
  • model_load(path) To load trained model from disk to R environment;

Example1: iris data set

 
library(gcForest)
 
sk <- reticulate::import('sklearn')
train_test_split <- sk$model_selection$train_test_split
 
data <- sk$datasets$load_iris
iris <- data()
X = iris$data
y = iris$target
data_split = train_test_split(X, y, test_size=0.33)
 
X_tr <- data_split[[1]]
X_te <- data_split[[2]]
y_tr <- data_split[[3]]
y_te <- data_split[[4]]
 
gcforest_m <- gcforest(shape_1X=4L, window=2L, tolerance=0.0)
gcforest_m$fit(X_tr,y_tr)
gcf_model <- model_save(gcforest_m,'../gcforest_model.model')
 
gcf <- model_load('../gcforest_model.model')
gcf$fit(X_tr, y_tr)
 

Example2: Digits data set

 
library(gcForest)
 
sk <- sk <- reticulate::import('sklearn')
train_test_split <- sk$model_selection$train_test_split
 
data <- sk$datasets$load_digits
digits <- data()
X = digits$data
y = digits$target
data_split = train_test_split(X, y, test_size=0.4)
 
gcforest_m <- gcforest(shape_1X=c(8L,8L), window=c(4L,6L), tolerance=0.0, min_samples_mgs=10L, min_samples_cascade=7L)
gcforest_m$fit(X_tr,y_tr)
gcf_model <- model_save(gcforest_m,'../gcforest_model.model')
 
gcf <- model_load('../gcforest_model.model')
gcf$fit(X_tr, y_tr)

Example3: Using mg-scanning and cascade_forest Sperately

# mg-scanning
gcforest_m <- gcForest(shape_1X=c(8L,8L), window=5L, min_samples_mgs=10L, min_samples_cascade=7L)
X_tr_mgs <- gcforest_m$mg_scanning(X_tr, y_tr)
 
X_te_mgs <- gcforest_m$mg_scanning(X_te)
 
# cascade_forest
gcforest_m <- gcForest(tolerance=0.0, min_samples_mgs=10L, min_samples_cascade=7L)
cf <- gcforest_m$cascade_forest(X_tr_mgs, y_tr)
 
pred_proba <- gcforest_m$cascade_forest(X_te_mgs)
pred_proba <- reticulate::py_to_r(pred_proba)
 
# then do mean and max

Example4: Skipping mg_scanning

gcforest_m <- gcForest(tolerance=0.0, min_samples_cascade=20L)
cf <- gcforest_m$cascade_forest(X_tr, y_tr)
pred_proba <- gcforest_m$cascade_forest(X_te)
pred_proba <- reticulate::py_to_r(pred_proba)
 
# then do mean and max
 

4.Notes

Thanks for the paper of Deep Forest( Zhou and Feng 2017 (https://arxiv.org/abs/1702.08835)) and the author of the gcForest Python3.X moulde (pylablanche https://github.com/pylablanche). And We constantly improve gcForest R package, and even consider putting official modules which provide by LAMDA(Learning And Mining from DatA)in gcForest R package.

News

Reference manual

It appears you don't have a PDF plugin for this browser. You can click here to download the reference manual.

install.packages("gcForest")

0.2.7 by Xu Jing, 5 months ago


https://github.com/DataXujing/gcForest_r


Report a bug at https://github.com/DataXujing/gcForest_r/issues


Browse source code at https://github.com/cran/gcForest


Authors: Xu Jing [cre]


Documentation:   PDF Manual  


GPL (>= 2) license


Imports reticulate, pkgdown, crayon, cli, utils

Suggests rmarkdown, knitr

System requirements: Python (>= 3.5.0)


See at CRAN