Data Augmentation with Variational Autoencoders and Manifold Sampling
This work addresses the problem of limited training data for machine learning practitioners, offering an incremental improvement in data augmentation techniques.
The paper tackles data augmentation in low sample size settings by proposing a new sampling method for Variational Autoencoders. On the OASIS database, balanced accuracy rises from 80.7% for a classifier trained on the raw data to 88.6% for one trained only on synthetic data.
We propose a new, efficient way to sample from a Variational Autoencoder in the challenging low sample size setting. This method proves particularly well suited to data augmentation in such a low data regime and is validated across various standard and real-life data sets. In particular, this scheme greatly improves classification results on the OASIS database, where balanced accuracy jumps from 80.7% for a classifier trained with the raw data to 88.6% when trained only with the synthetic data generated by our method. Similar results were also observed on 3 standard data sets and with other classifiers. Code is available at https://github.com/clementchadebec/Data_Augmentation_with_VAE-DALI.
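The headline results are reported as balanced accuracy, which is commonly defined as the mean of per-class recall, so that a minority class weighs as much as a majority class. The sketch below illustrates that standard definition on a toy imbalanced label set. The function name `balanced_accuracy` and the labels are illustrative assumptions; they are not taken from the paper, its code, or the OASIS data.

```python
from collections import defaultdict


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall: every class counts equally, whatever its size."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")

    total = defaultdict(int)    # number of samples per true class
    correct = defaultdict(int)  # number of correctly predicted samples per true class
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1

    recalls = [correct[c] / total[c] for c in total]
    return sum(recalls) / len(recalls)


# Toy imbalanced example (made-up labels, NOT OASIS data):
# 8 samples of class 0, 2 samples of class 1.
y_true = [0] * 8 + [1] * 2
# The classifier gets every class-0 sample right but misses one class-1 sample.
y_pred = [0] * 8 + [0, 1]

plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
bal_acc = balanced_accuracy(y_true, y_pred)

print(f"plain accuracy:    {plain_accuracy:.2f}")
print(f"balanced accuracy: {bal_acc:.2f}")
```

In this toy case the plain accuracy is 0.90 while the balanced accuracy is 0.75, because the recall on the minority class is only 0.5. This is why balanced accuracy is a natural choice when classes are unevenly represented, as is often the case in low sample size settings.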