Random Forests, Linear Trees, and Gradient Boosting for Inference and Interpretability
Provides fast implementations of Honest Random Forests,
Gradient Boosting, and Linear Random Forests, with an emphasis on inference
and interpretability. Additionally contains methods for variable
importance, out-of-bag prediction, regression monotonicity, and
several methods for missing data imputation. Soren R. Kunzel,
Theo F. Saarinen, Edward W. Liu, Jasjeet S. Sekhon (2019)
Sequential Permutation Testing of Random Forest Variable Importance Measures
Sequential permutation testing for statistical significance of predictors in random forests. The main function of the package is rfvimptest(), which tests the statistical significance of predictors in random forests using different (sequential) permutation test strategies. The advantage of sequential over conventional permutation tests is that they are considerably less computationally intensive, because the sequential procedure stops as soon as there is sufficient evidence for either the null or the alternative hypothesis.
Integrated Prediction using Uni-Variate and Multivariate Random Forests
An implementation of a framework for drug sensitivity prediction from various genetic characterizations using ensemble approaches. Random Forest or Multivariate Random Forest predictive models can be generated from each genetic characterization and then combined using a least squares regression approach. It also provides a choice of error estimation approaches (leave-one-out, bootstrap, N-fold cross-validation, and 0.632+ bootstrap), along with prediction confidence intervals generated using the jackknife-after-bootstrap approach.
A Random-Forest-Based Approach for Imputing Clustered Incomplete Data
It offers random-forest-based functions to impute clustered incomplete data. The package is tailored for but not limited to imputing multitissue expression data, in which a gene's expression is measured on the collected tissues of an individual but missing on the uncollected tissues.
Estimate Permutation p-Values for Random Forest Importance Metrics
Estimate the significance of importance metrics for a Random Forest model by permuting the response variable. Produces a null distribution of importance metrics for each predictor variable and a p-value for the observed value. Provides summary and visualization functions for 'randomForest' results.
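The response-permutation idea described above can be sketched in a few lines. This is a minimal illustration in Python, not the package's implementation: the function names `abs_correlation` and `permutation_p_value` are hypothetical, and the absolute Pearson correlation stands in for a Random Forest importance metric so that the example needs only the standard library. The p-value uses the common add-one estimate, (1 + number of null values at least as large as the observed value) / (1 + number of permutations).

```python
import random
from statistics import mean


def abs_correlation(x, y):
    """Stand-in importance metric (assumption): absolute Pearson correlation."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return abs(cov) / (var_x * var_y) ** 0.5


def permutation_p_value(x, y, importance=abs_correlation, n_perm=199, seed=0):
    """Return (observed metric, null distribution, permutation p-value)."""
    rng = random.Random(seed)
    observed = importance(x, y)

    # Shuffling the response breaks any link between predictor and response,
    # so the metric recomputed on shuffled data samples the null distribution.
    y_perm = list(y)
    null = []
    for _ in range(n_perm):
        rng.shuffle(y_perm)
        null.append(importance(x, y_perm))

    # Add-one estimate keeps the p-value strictly positive.
    exceed = sum(1 for value in null if value >= observed)
    return observed, null, (1 + exceed) / (1 + n_perm)


# Toy data: the response depends almost linearly on the predictor.
x = list(range(30))
y = [2 * v + (v % 3) for v in x]
observed, null, p_value = permutation_p_value(x, y)
print(f"observed={observed:.3f}, p={p_value:.3f}")
```

With a real forest, `importance` would be replaced by a function that refits the model on the permuted response and returns the importance of one predictor, which is far more expensive per permutation than this stand-in.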
Tune Random Forests Based on Variable Importance & Plot Results
Functions for assessing variable relations and associations prior to modeling with a Random Forest algorithm (although these are relevant for any predictive model). Metrics such as partial correlations and variance inflation factors are tabulated as well as plotted for the user. A function is available for tuning the main Random Forest hyper-parameter based on model performance and variable importance metrics. This grid-search technique provides tables and plots showing the effect of the main hyper-parameter on each of the assessment metrics. It also returns each of the evaluated models to the user. The package also provides superior variable importance plots for individual models. All of the plots are built so that the user can edit and further improve them. Derivations and methodology are described in Bladen (2022) <https://digitalcommons.usu.edu/etd/8587/>.
QSAR Modeling with Multiple Algorithms: MLR, PLS, and Random Forest
Quantitative Structure-Activity Relationship (QSAR) modeling is a valuable tool in computational chemistry and drug design, where it aims to predict the activity or property of chemical compounds based on their molecular structure. In this vignette, we present the 'rQSAR' package, which provides functions for variable selection and QSAR modeling using Multiple Linear Regression (MLR), Partial Least Squares (PLS), and Random Forest algorithms.
Merging of Satellite Datasets with Ground Observations using Random Forests
S3 implementation of the Random Forest MErging Procedure (RF-MEP), which combines two or more satellite-based datasets (e.g., precipitation products, topography) with ground observations to produce a new dataset with improved spatio-temporal distribution of the target field. In particular, this package was developed to merge different Satellite-based Rainfall Estimates (SREs) with measurements from rain gauges, in order to obtain a new precipitation dataset where the time series in the rain gauges are used to correct different types of errors present in the SREs. However, this package might be used to merge other hydrological/environmental satellite fields with point observations. For details, see Baez-Villanueva et al. (2020)
Missing Value Imputation using Random Forest for Prediction Settings
Missing data imputation based on the 'missForest' algorithm (Stekhoven, Daniel J (2012)
Optimising Random Forest Stability Through Selection of the Optimal Number of Trees
Calculates the stability of a random forest for given numbers of trees. The non-linear relationship between stability and the number of trees is described using a logistic regression model, which is then used to estimate the optimal number of trees.