Found 1881 packages in 0.17 seconds

RFmstate — by Yiqing Chen, 22 days ago

Random Forest-Based Multistate Survival Analysis

Fits cause-specific random survival forests for flexible multistate survival analysis, with covariate-adjusted transition probabilities computed via the product integral. State transitions are modeled by random forests. Subject-specific transition probability matrices are assembled from predicted cumulative hazards using the product-integral formula. Also provides a standalone Aalen-Johansen nonparametric estimator as a covariate-free baseline. Supports arbitrary state spaces with three or more states and any set of allowed transitions, and is applicable to clinical trials, disease progression, reliability engineering, and other domains where subjects move among discrete states over time. Provides per-transition feature importance, bias-variance diagnostics, and comprehensive visualizations. Handles right censoring and competing transitions. Methods are described in Ishwaran et al. (2008) for random survival forests, Putter et al. (2007) for multistate competing risks decomposition, and Aalen and Johansen (1978) <https://www.jstor.org/stable/4615704> for the nonparametric estimator.
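
The product-integral step mentioned in this description can be illustrated with a small sketch. This is not RFmstate's API (the package is written in R, and the example below is plain Python); the function names, the three-state illness-death layout, and the hazard increments are all hypothetical values chosen for illustration. The idea is that the transition probability matrix over an interval is the ordered product of (I + dA) over the time steps, where dA holds the cumulative-hazard increments for each allowed transition and each diagonal entry is minus the sum of its row's off-diagonal entries.

```python
# Minimal sketch of the product-integral formula (assumption: discrete time
# grid, made-up hazard increments; names are hypothetical, not RFmstate's).

def identity(n):
    """Return the n x n identity matrix as nested lists."""
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    """Multiply two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def product_integral(n_states, hazard_increments):
    """Multiply (I + dA) over time steps, in time order.

    hazard_increments: one dict per time step, mapping an allowed
    transition (from_state, to_state) to its cumulative-hazard increment.
    """
    P = identity(n_states)
    for step in hazard_increments:
        M = identity(n_states)
        for (g, h), dA in step.items():
            M[g][h] += dA   # probability mass moving from g to h
            M[g][g] -= dA   # the same mass leaves state g
        P = matmul(P, M)
    return P

# Hypothetical illness-death model: 0 = healthy, 1 = ill, 2 = dead.
increments = [
    {(0, 1): 0.10, (0, 2): 0.05, (1, 2): 0.20},
    {(0, 1): 0.08, (0, 2): 0.04, (1, 2): 0.25},
]
P = product_integral(3, increments)
for row in P:
    print([round(x, 3) for x in row])
```

Each row of the resulting matrix sums to one, and the absorbing state (here, state 2) keeps probability one of staying put. In a forest-based model the increments would come from predicted cumulative hazards for a given subject; in the covariate-free Aalen-Johansen estimator they would be observed event counts divided by numbers at risk.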

literanger — by Stephen Wade, 9 months ago

Fast Serializable Random Forests Based on 'ranger'

An updated implementation of the R package 'ranger' by Wright et al. (2017) for training and predicting from random forests, particularly suited to high-dimensional data and to embedding in 'Multiple Imputation by Chained Equations' (MICE) by van Buuren (2007). Ensembles of classification and regression trees are currently supported. Sparse data of class 'dgCMatrix' (R package 'Matrix') can be analyzed directly. Conventional bagged predictions are available alongside an efficient prediction for MICE via the algorithm proposed by Doove et al. (2014). Trained forests can be written to and read from storage. Survival and probability forests are not supported in the update, nor is data of class 'gwaa.data' (R package 'GenABEL'); use the original 'ranger' package for these analyses.

abcrf — by Jean-Michel Marin, 4 months ago

Approximate Bayesian Computation via Random Forests

Performs Approximate Bayesian Computation (ABC) model choice and parameter inference via random forests. See Pudlo P., Marin J.-M., Estoup A., Cornuet J.-M., Gautier M. and Robert C. P. (2016) and Raynal L., Marin J.-M., Pudlo P., Ribatet M., Robert C. P. and Estoup A. (2019).

steprf — by Jin Li, 4 years ago

Stepwise Predictive Variable Selection for Random Forest

An introduction to several novel predictive variable selection methods for random forest. They combine variable importance measures, namely averaged variable importance (AVI) and knowledge-informed AVI (KIAVI and KIAVI2), with predictive accuracy in stepwise algorithms. For details of the variable selection methods, see Li, J., Siwabessy, J., Huang, Z. and Nichol, S. (2019) and Li, J., Alvarez, B., Siwabessy, J., Tran, M., Huang, Z., Przeslawski, R., Radke, L., Howard, F. and Nichol, S. (2017).

moreparty — by Nicolas Robette, 7 months ago

A Toolbox for Conditional Inference Trees and Random Forests

Additions to the 'party' and 'partykit' packages: tools for the interpretation of forests (surrogate trees, prototypes, etc.), feature selection (see Gregorutti et al. (2017), Hapfelmeier and Ulm (2013), Altmann et al. (2010)), and parallelized versions of conditional forest and variable importance functions. Also includes modules and a shiny app for conditional inference trees.

MulvariateRandomForestVarImp — by Dogonadze Nika, 4 years ago

Variable Importance Measures for Multivariate Random Forests

Calculates two sets of post-hoc variable importance measures for multivariate random forests. The first set is given by the sum of mean split improvements for splits defined by feature j, measured on user-defined examples (i.e., training or testing samples). The second set is calculated on a per-outcome-variable basis as the sum of mean absolute differences of node values for each split defined by feature j, measured on user-defined examples (i.e., training or testing samples). The user can optionally threshold both sets of importance measures to include only splits that are statistically significant according to an F-test.

forestError — by Benjamin Lu, 5 years ago

A Unified Framework for Random Forest Prediction Error Estimation

Estimates the conditional error distributions of random forest predictions and common parameters of those distributions, including conditional misclassification rates, conditional mean squared prediction errors, conditional biases, and conditional quantiles, by out-of-bag weighting of out-of-bag prediction errors as proposed by Lu and Hardin (2021). This package is compatible with several existing packages that implement random forests in R.

RFlocalfdr — by Robert Dunne, a year ago

Significance Level for Random Forest Impurity Importance Scores

Sets a significance level for Random Forest MDI (Mean Decrease in Impurity, Gini or sum of squares) variable importance scores, using an empirical Bayes approach. See Dunne et al. (2022).

roseRF — by Elliot H. Young, a year ago

ROSE Random Forests for Robust Semiparametric Efficient Estimation

ROSE (RObust Semiparametric Efficient) random forests for robust semiparametric efficient estimation in partially parametric models (containing generalised partially linear models). Details can be found in the paper by Young and Shah (2024).

Sstack — by Kevin Matlock, 8 years ago

Bootstrap Stacking of Random Forest Models for Heterogeneous Data

Generates and predicts a set of linearly stacked Random Forest models using bootstrap sampling. Individual datasets may be heterogeneous (not all samples have full sets of features). Supports parallelization, but users should register their cores before running. This is an extension of the method found in Matlock (2018).