A Fast Implementation of Random Forests

A fast implementation of Random Forests, particularly suited for high-dimensional data. Ensembles of classification, regression, survival and probability prediction trees are supported. Data from genome-wide association studies can be analyzed efficiently. In addition to data frames, datasets of class 'gwaa.data' (R package 'GenABEL') and 'dgCMatrix' (R package 'Matrix') can be directly analyzed.
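
As a rough illustration of the description above, the following minimal sketch grows a small classification forest with the formula interface and predicts on a few rows. It assumes the ranger package is installed; the choice of the built-in iris data, 100 trees, and the seed value are arbitrary and only for illustration.

```r
# Minimal sketch; assumes the ranger package is installed.
library(ranger)

# Classification forest on the built-in iris data.
# The seed is fixed only so that repeated runs give the same forest.
rf <- ranger(Species ~ ., data = iris, num.trees = 100,
             importance = "impurity", seed = 42)

# Out-of-bag prediction error (misclassification rate for classification).
print(rf$prediction.error)

# Impurity importance, one value per predictor.
print(importance(rf))

# Predict classes for three rows, one from each species block.
pred <- predict(rf, data = iris[c(1, 51, 101), ])
print(pred$predictions)
```

Regression, survival and probability forests use the same call; the forest type is chosen from the outcome (or with probability = TRUE for class probabilities).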


Version 0.11.2
  • Bug fixes
Version 0.11.1
  • Bug fixes
Version 0.11.0
  • Add max.depth parameter to limit tree depth
  • Add inbag argument for manual selection of observations in trees
  • Add support for splitting weights with corrected impurity importance
  • Internal changes (slightly improved computation speed)
  • Warning: Possible seed differences compared to older versions
  • Bug fixes
Version 0.10.0
  • Change license of C++ core to MIT (R package is still GPL3)
  • Better 'order' mode for unordered factors for multiclass and survival
  • Add 'order' mode for unordered factors for GenABEL SNP data (binary classification and regression)
  • Add class-weighted Gini splitting
  • Add fixed proportion sampling
  • Add impurity importance for the maxstat splitting rule
  • Remove GenABEL from suggested packages (removed from CRAN). GenABEL data is still supported
  • Improve memory management (internal changes)
  • Bug fixes
Version 0.9.0
  • Add bias-corrected impurity importance (actual impurity reduction, AIR)
  • Add quantile prediction as in quantile regression forests
  • Add treeInfo() function to extract human readable tree structure
  • Add standard error estimation with the infinitesimal jackknife (now the default)
  • Add impurity importance for survival forests
  • Faster aggregation of predictions
  • Fix memory issues on Windows 7
  • Bug fixes
Version 0.8.0
  • Handle sparse data of class Matrix::dgCMatrix
  • Add prediction of standard errors to predict()
  • Allow devtools::install_github() without subdir and on Windows
  • Bug fixes
Version 0.7.0
  • Add randomized splitting (extraTrees)
  • Better formula interface: Support interaction terms and faster computation
  • Split at mid-point between candidate values
  • Improvements in holdoutRF and importance p-value estimation
  • Drop unused factor levels in outcome before growing
  • Add predict.all for probability and survival prediction
  • Bug fixes
Version 0.6.0
  • Set write.forest=TRUE by default
  • Add num.trees option to predict()
  • Faster version of getTerminalNodeIDs(), included in predict()
  • Handle new factor levels in 'order' mode
  • Use unadjusted p-value for 2 categories in maxstat splitting
  • Bug fixes
Version 0.5.0
  • Add Windows multithreading support for new toolchain
  • Add splitting by maximally selected rank statistics for survival and regression forests
  • Faster method for unordered factor splitting
  • Add p-values for variable importance
  • Runtime improvement for regression forests on classification data
  • Bug fixes
Version 0.4.0
  • Reduce memory usage of saved forest objects (changed child.nodeIDs interface)
  • Add keep.inbag option to track in-bag counts
  • Add option sample.fraction for fraction of sampled observations
  • Add tree-wise split.select.weights
  • Add predict.all option in predict() to get individual predictions for each tree for classification and regression
  • Add case-specific random forests
  • Add case weights (weighted bootstrapping or subsampling)
  • Remove tuning functions; please use mlr or caret instead
  • Catch error of outdated gcc not supporting C++11 completely
  • Bug fixes
Version 0.3.0
  • Allow the user to interrupt computation from R
  • Transpose classification.table and rename to confusion.matrix
  • Respect R seed for prediction
  • Memory improvements for variable importance computation
  • Fix bug: Probability prediction for single observations
  • Fix bug: Results not identical when using alternative interface
Version 0.2.7
  • Small fixes for Solaris compiler
Version 0.2.6
  • Add C-index splitting
  • Fix NA SNP handling
Version 0.2.5
  • Fix matrix and gwaa alternative survival interface
  • Version submitted to JSS
Version 0.2.4
  • Small changes in documentation
Version 0.2.3
  • Preallocate memory for splitting
Version 0.2.2
  • Remove recursive splitting
Version 0.2.1
  • Allow matrix as input data in R version
Version 0.2.0
  • Fix prediction of classification forests in R
Version 0.1.9
  • Speed up growing for continuous covariates
  • Add memory-saving option for very large datasets (slower)
  • Remove memory mode option from R version since no performance gain
Version 0.1.8
  • Fix problems when using Rcpp <0.11.4
Version 0.1.7
  • Add option to split on unordered categorical covariates
Version 0.1.6
  • Optimize memory management for very large survival forests
Version 0.1.5
  • Set required Rcpp version to 0.11.2
  • Fix large $call objects when using BatchJobs
  • Add details and example on GenABEL usage to documentation
  • Minor changes to documentation
Version 0.1.4
  • Speedup for survival forests with continuous covariates
  • R version: Generate seed from R. It is no longer necessary to set the seed argument in ranger calls.
Version 0.1.3
  • Windows support for R version (without multithreading)
Version 0.1.2
  • Speed up growing of regression and probability prediction forests
  • Prediction forests are now handled like regression forests: MSE used for prediction error and permutation importance
  • Fixed name conflict with randomForest package for "importance"
  • Fixed a bug: prediction function is now working for probability prediction forests
  • Slot "predictions" for probability forests now contains class probabilities
  • importance function is now working even if randomForest package is loaded after ranger
  • Fixed a bug: Split selection weights are now working as expected
  • Small changes in documentation
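
Several options named in the changelog above (max.depth, quantile prediction, predict.all, treeInfo()) can be tried together in a short sketch. It assumes ranger 0.11.0 or later is installed; the mtcars data, the tree count, and the depth limit are arbitrary choices for illustration, not recommended settings.

```r
# Minimal sketch; assumes ranger >= 0.11.0 is installed.
library(ranger)

# Regression forest with limited tree depth (max.depth, added in 0.11.0)
# and quantile regression enabled (quantreg, added in 0.9.0).
rf <- ranger(mpg ~ ., data = mtcars, num.trees = 50,
             max.depth = 3, quantreg = TRUE, seed = 1)

new_rows <- mtcars[1:3, ]

# Quantile prediction: one row per observation, one column per quantile.
q <- predict(rf, data = new_rows, type = "quantiles",
             quantiles = c(0.1, 0.5, 0.9))
print(q$predictions)

# Per-tree predictions (predict.all): one column per tree.
all_trees <- predict(rf, data = new_rows, predict.all = TRUE)
print(dim(all_trees$predictions))

# Human-readable structure of the first tree (treeInfo, added in 0.9.0).
print(head(treeInfo(rf, tree = 1)))
```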
