Data Augmentation for Machine Learning on Tabular Data

Implementation of a data augmentation technique based on conditional entropy. It was devised by both authors during their master's studies and is discussed in detail in the second author's dissertation. It creates novel samples conditioned on a desired value of a categorical attribute, as a way to augment data for classification tasks. Tests discussed in the dissertation and in a forthcoming paper indicate that the novel samples satisfy several statistical assumptions. The technique also shows significant improvement for machine learning models trained on small data.
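For readers unfamiliar with the quantity the description names, the following is a minimal sketch of conditional entropy for categorical columns. It is not the package's algorithm, which is described in the dissertation, and it does not use the package's API. The function names `entropy` and `conditional_entropy` and the toy columns are assumptions made only for illustration, and Python is used here purely for the sketch.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy, in bits, of a sequence of categorical values."""
    counts = Counter(values)
    total = len(values)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def conditional_entropy(target, given):
    """H(target | given): entropy of `target` inside each group of `given`,
    averaged with weights equal to the relative group sizes."""
    if len(target) != len(given):
        raise ValueError("target and given must have the same length")
    groups = {}
    for t, g in zip(target, given):
        groups.setdefault(g, []).append(t)
    total = len(target)
    return sum(len(m) / total * entropy(m) for m in groups.values())


# Hypothetical toy columns, not taken from the package or the dissertation.
label = ["yes", "yes", "no", "no"]
informative = ["a", "a", "b", "b"]      # fully determines the label
uninformative = ["a", "b", "a", "b"]    # tells nothing about the label

h_informative = conditional_entropy(label, informative)      # 0 bits
h_uninformative = conditional_entropy(label, uninformative)  # 1 bit
print(h_informative, h_uninformative)
```

A low conditional entropy means the conditioning attribute leaves little uncertainty about the target. How the package uses this quantity to generate samples is not stated on this page.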



0.1.0 by Rafael S. Pereira, 7 months ago

Browse source code at

Authors: Rafael S. Pereira [aut, cre, cph], Henrique Matheus Ferreira da Silva [aut, cph], Fabio A. M. Porto [aut, ths, cph]

Documentation: PDF Manual

MIT + file LICENSE license

Suggests knitr, ggplot2, markdown

See at CRAN