Fuzzy forests, a new algorithm based on random forests, is designed to reduce the bias seen in random forest feature selection caused by the presence of correlated features. Fuzzy forests uses recursive feature elimination random forests to select features from separate blocks of correlated features where the correlation within each block of features is high and the correlation between blocks of features is low. One final random forest is fit using the surviving features. This package fits random forests using the 'randomForest' package and allows for easy use of 'WGCNA' to split features into distinct blocks.
fuzzyforest is an extension of random forests designed to yield less biased variable importance rankings when features are correlated with one another. The algorithm requires that features be partitioned into separate groups or modules such that the correlation within groups is large and the correlation between groups is small. fuzzyforest allows for easy integration with the package WGCNA.
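To make the partitioning requirement concrete, here is a minimal sketch that groups features into modules using only base R. It is not the WGCNA procedure: it swaps in plain average-linkage hierarchical clustering on 1 - |correlation| as a simple stand-in. The toy data, the choice of two modules, and the variable name module_membership are all assumptions made for illustration.

```r
# Toy data (hypothetical): two blocks of features, each driven by its own
# latent factor, so correlation is high within a block and low between blocks.
set.seed(1)
n <- 200
z1 <- rnorm(n)  # latent factor for block "a"
z2 <- rnorm(n)  # latent factor for block "b"
X <- data.frame(
  a1 = z1 + rnorm(n, sd = 0.3),
  a2 = z1 + rnorm(n, sd = 0.3),
  a3 = z1 + rnorm(n, sd = 0.3),
  b1 = z2 + rnorm(n, sd = 0.3),
  b2 = z2 + rnorm(n, sd = 0.3),
  b3 = z2 + rnorm(n, sd = 0.3)
)

# Dissimilarity between features: strongly correlated features are close.
d <- as.dist(1 - abs(cor(X)))

# Average-linkage hierarchical clustering, cut into two modules
# (k = 2 is an assumption that matches how the toy data was built).
tree <- hclust(d, method = "average")
module_membership <- cutree(tree, k = 2)

# Named integer vector with one module label per feature.
print(module_membership)
```

The result assigns one module label to each column of X. On real data the number of modules is rarely known in advance, which is one reason to prefer a dedicated tool such as WGCNA over a fixed cut like this one.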
install.packages("fuzzyforest")
To enable the full functionality of fuzzyforest, the package WGCNA must be installed. However, WGCNA requires the installation of a few packages from Bioconductor. To install WGCNA, type the following lines into the console:
setRepositories(ind=1:2)
install.packages("WGCNA")
source("http://bioconductor.org/biocLite.R")
biocLite("AnnotationDbi", type="source")
biocLite("GO.db")
If further issues with the installation of WGCNA arise, see the WGCNA website: http://labs.genetics.ucla.edu/horvath/CoexpressionNetwork/Rpackages/WGCNA/index.html#manualInstall
This work is partially supported through NSF grant IIS 1251151.