Lightweight Data Quality Simulation for Classification

Data quality simulation can be used to check how robust data analysis findings are and to learn how data quality contaminations affect classification. This package adds contaminations to data and then evaluates the simulated data sets for classification accuracy. The supported contaminations are noise, missing values, outliers, low variance, irrelevant features, class swap (inconsistency), class imbalance, and a decrease in data volume. As a lightweight solution, it lets simulation runs be set up with little or no up-front effort.
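To make the idea of a contamination concrete, here is a minimal base-R sketch of three of them (noise, missing values, and class swap) applied to the built-in iris data. It does not use preprosim: the helper names add_noise, add_missing, and swap_classes, their arguments, and the contamination levels are all hypothetical and chosen only for illustration. The package's own functions and defaults may work differently, and the accuracy evaluation step is left out.

```r
# Illustrative sketch only: these helpers are hypothetical, NOT preprosim functions.
set.seed(1)

# Noise: add Gaussian noise to each numeric column, scaled by that column's sd.
add_noise <- function(df, level) {
  num <- vapply(df, is.numeric, logical(1))
  df[num] <- lapply(df[num], function(x) x + rnorm(length(x), sd = level * sd(x)))
  df
}

# Missing values: set a fraction of the rows to NA in each numeric column.
add_missing <- function(df, fraction) {
  num <- which(vapply(df, is.numeric, logical(1)))
  for (j in num) {
    idx <- sample(nrow(df), size = round(fraction * nrow(df)))
    df[idx, j] <- NA
  }
  df
}

# Class swap: give a fraction of the rows a different, randomly chosen class label.
swap_classes <- function(df, class_col, fraction) {
  labels <- levels(df[[class_col]])
  idx <- sample(nrow(df), size = round(fraction * nrow(df)))
  for (i in idx) {
    others <- setdiff(labels, as.character(df[[class_col]][i]))
    df[[class_col]][i] <- sample(others, 1)
  }
  df
}

# Apply the three contaminations in sequence (levels are arbitrary examples).
contaminated <- swap_classes(add_missing(add_noise(iris, 0.1), 0.05), "Species", 0.1)

colSums(is.na(contaminated))               # NA count per column
sum(contaminated$Species != iris$Species)  # number of swapped class labels
```

Training the same classifier on iris and on contaminated, and comparing the two accuracies, would then give a rough picture of how sensitive the model is to these contaminations.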



install.packages("preprosim")

0.2.0 by Markus Vattulainen, 9 months ago


https://github.com/mvattulainen/preprosim


Report a bug at https://github.com/mvattulainen/preprosim/issues


Browse source code at https://github.com/cran/preprosim


Authors: Markus Vattulainen [aut, cre]




GPL-2 license


Imports: DMwR, reshape2, ggplot2, methods, stats, caret, doParallel, foreach, e1071

Suggests: gbm, preprocomb, preproviz, knitr, rmarkdown

