Nonparametric Missing Value Imputation using Random Forest

The function 'missForest' in this package imputes missing values, particularly in mixed-type data. It trains a random forest on the observed values of a data matrix and uses it to predict the missing values. It can impute continuous and/or categorical data, including data with complex interactions and non-linear relations. It yields an out-of-bag (OOB) imputation error estimate without the need for a test set or elaborate cross-validation, and it can be run in parallel to save computation time.
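As a rough illustration of the workflow described above, the following sketch imputes artificially removed values in the built-in iris data and reads off the OOB error estimate. It assumes the missForest package is installed from CRAN. The choice of iris, the 10% missingness rate, the seed, and the variable names (iris_mis, imp, iris_imp, oob_err, true_err) are illustrative assumptions, not part of the package description.

```r
# Minimal sketch; assumes the missForest package is installed.
library(missForest)

set.seed(42)                             # make the example reproducible

data(iris)                               # mixed-type data: 4 numeric columns, 1 factor
iris_mis <- prodNA(iris, noNA = 0.1)     # delete 10% of the entries completely at random

# Impute with random forests; ntree and maxiter are set to their default values here.
imp <- missForest(iris_mis, ntree = 100, maxiter = 10)

iris_imp <- imp$ximp                     # the imputed data frame
oob_err  <- imp$OOBerror                 # OOB estimate of the imputation error

# The true values are known in this toy setting, so the OOB estimate
# can be compared with the actual imputation error.
true_err <- mixError(iris_imp, iris_mis, iris)
```

For mixed-type data the error is reported separately for continuous variables (normalized root mean squared error) and categorical variables (proportion of falsely classified entries). To run in parallel, a parallel backend has to be registered first (for example with the doParallel package), after which the 'parallelize' argument can be set to 'variables' or 'forests'.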


missForest is a nonparametric, mixed-type imputation method that handles continuous and categorical variables in the same data set.
Here, we host the R-package "missForest" for the statistical software R.

The method is based on the publication Stekhoven and Bühlmann, 2012. The R package contains a vignette on how to use "missForest" in R including many helpful examples.

Upcoming innovations:

  • use of prediction
  • 'real' multiple imputation

Contact me by email: [email protected]

References:

Stekhoven, D.J. and Bühlmann, P. (2012), 'MissForest - nonparametric missing value imputation for mixed-type data', Bioinformatics, 28(1), 112-118, doi: 10.1093/bioinformatics/btr597

