Phased Data Augmentation for Training a Likelihood-Based Generative Model with Limited Data
This work addresses data-efficient training for generative models beyond GANs, which matters most in domains where data collection is costly. The contribution is incremental, however, because it extends existing augmentation techniques to new model types.
The paper tackles the problem of training likelihood-based generative models with limited data. It introduces phased data augmentation, which limits augmentation intensity across the learning phases so that training does not alter the data distribution, and it reports superior performance in quantitative and qualitative evaluations across diverse datasets.
Generative models excel in creating realistic images, yet their dependency on extensive datasets for training presents significant challenges, especially in domains where data collection is costly or challenging. Current data-efficient methods largely focus on GAN architectures, leaving a gap in training other types of generative models. Our study introduces "phased data augmentation" as a novel technique that addresses this gap by optimizing training in limited data scenarios without altering the inherent data distribution. By limiting the augmentation intensity throughout the learning phases, our method enhances the model's ability to learn from limited data, thus maintaining fidelity. Applied to a model integrating PixelCNNs with VQ-VAE-2, our approach demonstrates superior performance in both quantitative and qualitative evaluations across diverse datasets. This represents an important step forward in the efficient training of likelihood-based models, extending the usefulness of data augmentation techniques beyond just GANs.
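The abstract does not spell out the schedule, so the following is only a minimal sketch of how a phased limit on augmentation intensity might look, not the authors' implementation. The function names `phase_intensity` and `random_shift`, the equal-length phases, the linear decrease to zero in the final phase, and the choice of a horizontal circular shift as the augmentation are all assumptions made for illustration.

```python
import random


def phase_intensity(step, total_steps, num_phases=4, max_intensity=1.0):
    """Return the augmentation intensity allowed at a given training step.

    Assumption (not from the paper): training is split into equal-length
    phases, and the intensity cap falls linearly from max_intensity in the
    first phase to 0 in the last, so the final phase sees unaugmented data.
    """
    if not 0 <= step < total_steps:
        raise ValueError("step must be in [0, total_steps)")
    if num_phases == 1:
        return 0.0
    phase = step * num_phases // total_steps  # integer in 0 .. num_phases - 1
    return max_intensity * (1 - phase / (num_phases - 1))


def random_shift(image, intensity, rng):
    """Circularly shift each row of a 2D list by at most intensity * width pixels.

    This stands in for whatever augmentations a real pipeline would use.
    """
    width = len(image[0])
    max_shift = int(intensity * width)
    shift = rng.randint(-max_shift, max_shift)
    if shift == 0:
        return [list(row) for row in image]
    return [row[-shift:] + row[:-shift] for row in image]


if __name__ == "__main__":
    rng = random.Random(0)
    image = [[0, 1, 2, 3], [4, 5, 6, 7]]
    total_steps = 8
    for step in range(total_steps):
        cap = phase_intensity(step, total_steps)
        augmented = random_shift(image, cap, rng)
        print(step, round(cap, 3), augmented)
```

Under these assumptions, early steps may shift the image by up to its full width, while steps in the last phase return the image unchanged, which is one way to keep the model's final updates anchored to the original data distribution.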