
Found 1641 packages in 0.05 seconds

RFlocalfdr — by Robert Dunne, a year ago

Significance Level for Random Forest Impurity Importance Scores

Sets a significance level for Random Forest MDI (Mean Decrease in Impurity, Gini or sum of squares) variable importance scores, using an empirical Bayes approach. See Dunne et al. (2022).

roseRF — by Elliot H. Young, a month ago

ROSE Random Forests for Robust Semiparametric Efficient Estimation

ROSE (RObust Semiparametric Efficient) random forests for robust semiparametric efficient estimation in partially parametric models (containing generalised partially linear models). Details can be found in the paper by Young and Shah (2024).

Sstack — by Kevin Matlock, 7 years ago

Bootstrap Stacking of Random Forest Models for Heterogeneous Data

Generates and predicts a set of linearly stacked Random Forest models using bootstrap sampling. Individual datasets may be heterogeneous (not all samples have full sets of features). Contains support for parallelization, but the user should register their cores before running. This is an extension of the method found in Matlock (2018).
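The linear-stacking idea behind Sstack can be illustrated with a minimal, self-contained sketch: combine two base models' holdout predictions with weights fitted by least squares. The data, the two-model restriction, and the function names are illustrative assumptions; the package's own interface, bootstrap sampling, and handling of heterogeneous feature sets are not shown.

```python
# Minimal sketch of linear stacking: find weights w1, w2 minimizing
# ||w1*p1 + w2*p2 - y||^2 on a holdout set, via the 2x2 normal equations.
# Illustrative only; not Sstack's actual API or bootstrap scheme.

def stack_weights(p1, p2, y):
    """Closed-form least-squares weights for two base models."""
    a = sum(x * x for x in p1)
    b = sum(x * z for x, z in zip(p1, p2))
    d = sum(z * z for z in p2)
    e = sum(x * t for x, t in zip(p1, y))
    f = sum(z * t for z, t in zip(p2, y))
    det = a * d - b * b
    return (e * d - b * f) / det, (a * f - b * e) / det

# Hypothetical holdout predictions from two base forests, plus truth:
p1 = [1.0, 2.0, 3.0, 4.0]
p2 = [1.5, 1.5, 3.5, 3.5]
y  = [1.2, 1.8, 3.2, 3.8]
w1, w2 = stack_weights(p1, p2, y)
print(w1, w2)  # 0.6 0.4 -- the stacked model is 0.6*p1 + 0.4*p2
```

Here the stack reproduces the holdout targets exactly because the toy targets were chosen as a convex combination of the two base predictions; with real forests the weights would only minimize, not eliminate, the residual.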

randomForestExplainer — by Yue Jiang, 4 years ago

Explaining and Visualizing Random Forests in Terms of Variable Importance

A set of tools to help explain which variables are most important in a random forest. Various variable importance measures are calculated and visualized in different settings in order to see how their importance changes depending on our criteria (Hemant Ishwaran, Udaya B. Kogalur, Eiran Z. Gorodeski, Andy J. Minn and Michael S. Lauer (2010); Leo Breiman (2001)).

SAEforest — by Patrick Krennmair, 2 years ago

Mixed Effect Random Forests for Small Area Estimation

Mixed Effects Random Forests (MERFs) are a data-driven, nonparametric alternative to current methods of Small Area Estimation (SAE). 'SAEforest' provides functions for the estimation of regionally disaggregated linear and nonlinear indicators using survey sample data. Included procedures facilitate the estimation of domain-level economic and inequality metrics and assess associated uncertainty. Emphasis lies on straightforward interpretation and visualization of results. From a methodological perspective, the package builds on approaches discussed in Krennmair and Schmid (2022) and Krennmair et al. (2022).

metaforest — by Caspar J. van Lissa, 10 months ago

Exploring Heterogeneity in Meta-Analysis using Random Forests

Conduct random forests-based meta-analysis, obtain partial dependence plots for metaforest and classic meta-analyses, and cross-validate and tune metaforest- and classic meta-analyses in conjunction with the caret package. A requirement of classic meta-analysis is that the studies being aggregated are conceptually similar, and ideally, close replications. However, in many fields, there is substantial heterogeneity between studies on the same topic. Classic meta-analysis lacks the power to assess more than a handful of univariate moderators. MetaForest, by contrast, has substantial power to explore heterogeneity in meta-analysis. It can identify important moderators from a larger set of potential candidates (Van Lissa, 2020). This is an appealing quality, because many meta-analyses have small sample sizes. Moreover, MetaForest yields a measure of variable importance which can be used to identify important moderators, and offers partial prediction plots to explore the shape of the marginal relationship between moderators and effect size.
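To see what MetaForest generalizes, here is a minimal sketch of the classic fixed-effect meta-analysis it is contrasted with: pooling study effect sizes by inverse-variance weighting, so that more precise studies count more. The effect sizes and variances are made-up illustration values; this is not metaforest's API.

```python
# Classic fixed-effect meta-analysis: inverse-variance weighted pooling.
# Numbers below are invented for illustration.

def pooled_effect(effects, variances):
    """Return the inverse-variance weighted mean effect and its variance."""
    weights = [1.0 / v for v in variances]
    est = sum(w * e for w, e in zip(weights, effects)) / sum(weights)
    var = 1.0 / sum(weights)  # variance of the pooled estimate
    return est, var

effects = [0.30, 0.10, 0.50]      # per-study effect sizes
variances = [0.04, 0.01, 0.09]    # per-study sampling variances
est, var = pooled_effect(effects, variances)
```

The pooled estimate is pulled toward the most precise study (effect 0.10, variance 0.01), and its variance is smaller than any single study's. MetaForest replaces the assumption of one common effect with a random forest over study-level moderators.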

tree.interpreter — by Qingyao Sun, 5 years ago

Random Forest Prediction Decomposition and Feature Importance Measure

An R re-implementation of the 'treeinterpreter' package on PyPI <https://pypi.org/project/treeinterpreter/>. Each prediction can be decomposed as 'prediction = bias + feature_1_contribution + ... + feature_n_contribution'. This decomposition is then used to calculate the Mean Decrease Impurity (MDI) and Mean Decrease Impurity using out-of-bag samples (MDI-oob) feature importance measures based on the work of Li et al. (2019).
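The decomposition above can be sketched for a single regression tree: the bias is the root node's mean target value, and each split along a sample's decision path attributes the change in node value to the split feature. The hand-built tree below is an illustrative assumption, not the package's own data structure.

```python
# Minimal sketch of the treeinterpreter-style decomposition
# prediction = bias + sum(feature contributions), for one tree.

class Node:
    def __init__(self, value, feature=None, threshold=None, left=None, right=None):
        self.value = value          # mean of training targets in this node
        self.feature = feature      # split feature index (None => leaf)
        self.threshold = threshold
        self.left = left
        self.right = right

def decompose(node, x):
    """Walk x's decision path, attributing each change in node value
    to the feature used at that split."""
    bias = node.value               # root value = training-set mean
    contrib = {}
    while node.feature is not None:
        child = node.left if x[node.feature] <= node.threshold else node.right
        contrib[node.feature] = contrib.get(node.feature, 0.0) + child.value - node.value
        node = child
    return bias, contrib, node.value  # leaf value is the prediction

# Hand-built tree: root mean 10, split on x[0], then on x[1].
tree = Node(10.0, feature=0, threshold=0.5,
            left=Node(6.0),
            right=Node(14.0, feature=1, threshold=0.5,
                       left=Node(12.0), right=Node(16.0)))

bias, contrib, pred = decompose(tree, [1.0, 1.0])
print(bias, contrib, pred)  # 10.0 {0: 4.0, 1: 2.0} 16.0
```

For a forest, the same bookkeeping is averaged over trees, so the identity prediction = bias + sum of contributions still holds exactly.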

iRafNet — by Francesca Petralia, 8 years ago

Integrative Random Forest for Gene Regulatory Network Inference

Provides a flexible integrative algorithm that allows information from prior data, such as protein-protein interactions and gene knock-down, to be jointly considered for gene regulatory network inference.

blockForest — by Marvin N. Wright, 2 years ago

Block Forests: Random Forests for Blocks of Clinical and Omics Covariate Data

A random forest variant 'block forest' ('BlockForest') tailored to the prediction of binary, survival and continuous outcomes using block-structured covariate data, for example, clinical covariates plus measurements of a certain omics data type or multi-omics data, that is, data for which measurements of different types of omics data and/or clinical data for each patient exist. Examples of different omics data types include gene expression measurements, mutation data and copy number variation measurements. Block forests are presented in Hornung & Wright (2019). The package includes four other random forest variants for multi-omics data: 'RandomBlock', 'BlockVarSel', 'VarProb', and 'SplitWeights'. These were also considered in Hornung & Wright (2019), but performed worse than block forest in their comparison study based on 20 real multi-omics data sets. Therefore, we recommend using block forest ('BlockForest') in applications. The other random forest variants can, however, be consulted for academic purposes, for example, in the context of further methodological developments. Reference: Hornung, R. & Wright, M. N. (2019) Block Forests: random forests for blocks of clinical and omics covariate data. BMC Bioinformatics 20:358.

obliqueRSF — by Byron Jaeger, 2 years ago

Oblique Random Forests for Right-Censored Time-to-Event Data

Oblique random survival forests incorporate linear combinations of input variables into random survival forests (Ishwaran, 2008). Regularized Cox proportional hazard models (Simon, 2016) are used to identify optimal linear combinations of input variables.
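The oblique idea can be sketched in a few lines: where an axis-aligned tree node tests a single feature (x[j] <= t), an oblique node thresholds a linear combination of features, giving a tilted split boundary. The weights and threshold below are made-up illustration values, not coefficients fitted by obliqueRSF's regularized Cox step.

```python
# Axis-aligned split: route by x[j] <= t.
# Oblique split: route by w . x <= t, i.e. a linear combination of inputs.
# Weights/threshold here are illustrative, not fitted values.

def oblique_branch(x, w, t):
    """Route a sample left or right by the sign of (w . x) - t."""
    score = sum(wi * xi for wi, xi in zip(w, x))
    return "left" if score <= t else "right"

# With w = (0.8, -0.6), the split boundary is the tilted line
# 0.8*x1 - 0.6*x2 = 0.1 rather than a vertical or horizontal cut.
side_a = oblique_branch([1.0, 1.0], w=(0.8, -0.6), t=0.1)  # score 0.2
side_b = oblique_branch([0.5, 1.0], w=(0.8, -0.6), t=0.1)  # score -0.2
print(side_a, side_b)  # right left
```

In the survival setting, each node's weight vector is chosen to separate patients by risk, which is where the regularized Cox models come in.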