Found 1641 packages in 0.71 seconds

Rforestry — by Theo Saarinen, 2 years ago

Random Forests, Linear Trees, and Gradient Boosting for Inference and Interpretability

Provides fast implementations of Honest Random Forests, Gradient Boosting, and Linear Random Forests, with an emphasis on inference and interpretability. Additionally contains methods for variable importance, out-of-bag prediction, regression monotonicity, and several methods for missing data imputation. Soren R. Kunzel, Theo F. Saarinen, Edward W. Liu, Jasjeet S. Sekhon (2019).

rfvimptest — by Roman Hornung, 2 years ago

Sequential Permutation Testing of Random Forest Variable Importance Measures

Sequential permutation testing for statistical significance of predictors in random forests. The main function of the package is rfvimptest(), which tests the statistical significance of predictors in random forests using different (sequential) permutation test strategies. The advantage of sequential over conventional permutation tests is that they are considerably less computationally intensive, because the sequential procedure stops as soon as there is sufficient evidence for either the null or the alternative hypothesis.
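
The early-stopping idea can be illustrated with a minimal Python sketch. It is not the rfvimptest implementation and does not use its API. It shows one classic sequential scheme (a Besag-Clifford style sequential Monte Carlo p-value), which stops early only when the evidence favors the null; the package's own strategies may differ. The function name `sequential_permutation_p`, the parameters `h` and `max_perm`, and the standard-normal null draw are all assumptions made for illustration.

```python
import random


def sequential_permutation_p(observed, draw_null, h=10, max_perm=999):
    """Sequential Monte Carlo p-value with early stopping (illustrative only).

    draw_null() returns one test statistic computed under a random permutation.
    Sampling stops as soon as h null statistics are >= observed.
    Returns (p_value, number_of_permutations_used).
    """
    exceed = 0
    for used in range(1, max_perm + 1):
        if draw_null() >= observed:
            exceed += 1
            if exceed == h:
                # Early stop: the observed statistic is clearly unremarkable.
                return exceed / used, used
    # All permutations were used: conventional add-one permutation p-value.
    return (exceed + 1) / (max_perm + 1), max_perm


# Hypothetical null distribution: standard normal draws stand in for
# importance values recomputed on permuted data.
rng = random.Random(0)


def draw():
    return rng.gauss(0, 1)


p_null, n_null = sequential_permutation_p(0.1, draw)  # unremarkable statistic
p_alt, n_alt = sequential_permutation_p(5.0, draw)    # extreme statistic
print(f"unremarkable: p={p_null:.3f} after {n_null} permutations")
print(f"extreme:      p={p_alt:.3f} after {n_alt} permutations")
```

In this sketch an unremarkable statistic needs only a handful of permutations, while an extreme one runs through the full budget; schemes that also stop early in favor of the alternative need an additional stopping rule.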

IntegratedMRF — by Raziur Rahman, 6 years ago

Integrated Prediction using Uni-Variate and Multivariate Random Forests

An implementation of a framework for drug sensitivity prediction from various genetic characterizations using ensemble approaches. Random Forest or Multivariate Random Forest predictive models can be generated from each genetic characterization and are then combined using a least squares regression approach. The package also provides several error estimation approaches (leave-one-out, bootstrap, N-fold cross-validation, and 0.632+ bootstrap), along with prediction confidence intervals generated using a jackknife-after-bootstrap approach.

MixRF — by Jiebiao Wang, 9 years ago

A Random-Forest-Based Approach for Imputing Clustered Incomplete Data

It offers random-forest-based functions to impute clustered incomplete data. The package is tailored for but not limited to imputing multitissue expression data, in which a gene's expression is measured on the collected tissues of an individual but missing on the uncollected tissues.

rfPermute — by Eric Archer, a year ago

Estimate Permutation p-Values for Random Forest Importance Metrics

Estimates the significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of importance metrics for each predictor variable and a p-value for the observed value. Provides summary and visualization functions for 'randomForest' results.
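
The permutation idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the rfPermute implementation or its API: the absolute Pearson correlation is used as a stand-in for a random forest importance metric so that the example stays self-contained, and the function names `importance` and `permutation_p_value` are hypothetical.

```python
import random
import statistics


def importance(x, y):
    """Stand-in importance metric: absolute Pearson correlation (assumption)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return abs(sxy) / (sxx * syy) ** 0.5


def permutation_p_value(x, y, n_perm=999, seed=0):
    """Permute the response to build a null distribution for the metric."""
    rng = random.Random(seed)
    observed = importance(x, y)
    y_perm = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)  # breaks any real link between x and y
        null.append(importance(x, y_perm))
    # Add-one correction so the p-value is never exactly zero.
    exceed = sum(1 for v in null if v >= observed)
    return observed, (exceed + 1) / (n_perm + 1)


# Synthetic data: one predictor drives the response, the other is pure noise.
rng = random.Random(1)
x_signal = [i / 10 for i in range(50)]
y = [2 * v + rng.gauss(0, 1) for v in x_signal]
x_noise = [rng.gauss(0, 1) for _ in range(50)]

obs_signal, p_signal = permutation_p_value(x_signal, y)
obs_noise, p_noise = permutation_p_value(x_noise, y)
print(f"signal: importance={obs_signal:.3f}, p={p_signal:.3f}")
print(f"noise:  importance={obs_noise:.3f}, p={p_noise:.3f}")
```

With a real random forest, each permutation would require refitting the model on the shuffled response, which is why the number of permutations dominates the running time.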

randomForestVIP — by Kelvyn Bladen, a year ago

Tune Random Forests Based on Variable Importance & Plot Results

Functions for assessing variable relations and associations prior to modeling with a Random Forest algorithm (although these are relevant for any predictive model). Metrics such as partial correlations and variance inflation factors are tabulated as well as plotted for the user. A function is available for tuning the main Random Forest hyper-parameter based on model performance and variable importance metrics. This grid-search technique provides tables and plots showing the effect of the main hyper-parameter on each of the assessment metrics. It also returns each of the evaluated models to the user. The package also provides superior variable importance plots for individual models. All of the plots are developed so that the user has the ability to edit and improve further upon the plots. Derivations and methodology are described in Bladen (2022) <https://digitalcommons.usu.edu/etd/8587/>.

rQSAR — by Oche Ambrose George, 8 months ago

QSAR Modeling with Multiple Algorithms: MLR, PLS, and Random Forest

Quantitative Structure-Activity Relationship (QSAR) modeling is a valuable tool in computational chemistry and drug design, where it aims to predict the activity or property of chemical compounds based on their molecular structure. In this vignette, we present the 'rQSAR' package, which provides functions for variable selection and QSAR modeling using Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest algorithms.

RFmerge — by Mauricio Zambrano-Bigiarini, 4 years ago

Merging of Satellite Datasets with Ground Observations using Random Forests

S3 implementation of the Random Forest MErging Procedure (RF-MEP), which combines two or more satellite-based datasets (e.g., precipitation products, topography) with ground observations to produce a new dataset with improved spatio-temporal distribution of the target field. In particular, this package was developed to merge different Satellite-based Rainfall Estimates (SREs) with measurements from rain gauges, in order to obtain a new precipitation dataset where the time series in the rain gauges are used to correct different types of errors present in the SREs. However, this package might be used to merge other hydrological/environmental satellite fields with point observations. For details, see Baez-Villanueva et al. (2020). Bug reports, comments, questions, and collaboration of any kind are very welcome.

missForestPredict — by Elena Albu, a year ago

Missing Value Imputation using Random Forest for Prediction Settings

Missing data imputation based on the 'missForest' algorithm (Stekhoven, Daniel J (2012)) with adaptations for prediction settings. The function missForest() is used to impute a (training) dataset with missing values and to learn imputation models that can be later used for imputing new observations. The function missForestPredict() is used to impute one or multiple new observations (test set) using the models learned on the training data.

optRF — by Thomas Martin Lange, 2 months ago

Optimising Random Forest Stability Through Selection of the Optimal Number of Trees

Calculates the stability of random forests with given numbers of trees. The non-linear relationship between stability and the number of trees is described using a logistic regression model and used to estimate the optimal number of trees.