Effective Data Augmentation with Multi-Domain Learning GANs
This work addresses the problem of data scarcity for deep learning practitioners, offering an incremental improvement over existing GAN-based augmentation methods.
The paper tackles the high cost of data collection and labeling in deep learning by proposing Domain Fusion, a GAN-based data augmentation method that uses multi-domain learning to generate high-fidelity samples for target tasks. On CIFAR-100, FGVC-Aircraft, and Indoor Scene Recognition, each reduced to 5,000 images, it achieves better classification accuracy than data augmentation with fine-tuned GANs.
In deep learning applications, large-scale data development (e.g., collecting and labeling), an essential step in building practical applications, still incurs very high costs. In this work, we propose an effective data augmentation method based on generative adversarial networks (GANs), called Domain Fusion. Our key idea is to import the knowledge contained in an outer dataset into a target model by using a multi-domain learning GAN. The multi-domain learning GAN learns the outer and target datasets simultaneously and generates new samples for the target tasks. This simultaneous learning lets the GAN generate target samples with high fidelity and variety. As a result, we can obtain accurate models for the target tasks by using these generated samples, even if we have only an extremely low-volume target dataset. We experimentally evaluate the advantages of Domain Fusion in image classification tasks on three target datasets: CIFAR-100, FGVC-Aircraft, and Indoor Scene Recognition. When each target dataset is reduced to 5,000 images, Domain Fusion achieves better classification accuracy than data augmentation using fine-tuned GANs. Furthermore, we show that Domain Fusion improves the quality of the generated samples and that these improvements can contribute to higher accuracy.
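The data flow behind simultaneous learning can be illustrated with a minimal sketch. The abstract does not say how the GAN distinguishes the two domains, so this sketch assumes a class-conditional GAN with one shared label space in which target classes are placed after the outer classes. The names `JointLabelSpace`, `build_joint_dataset`, and `sample_target_conditions` are hypothetical and are not taken from the paper; the GAN itself is omitted.

```python
import random
from dataclasses import dataclass
from typing import Any, List, Sequence, Tuple

# A sample is (image, class label); the image type is left open in this sketch.
Sample = Tuple[Any, int]


@dataclass(frozen=True)
class JointLabelSpace:
    """Hypothetical shared label space: outer classes first, then target classes."""

    num_outer_classes: int
    num_target_classes: int

    @property
    def size(self) -> int:
        return self.num_outer_classes + self.num_target_classes

    def from_outer(self, label: int) -> int:
        # Outer labels keep their original index.
        return label

    def from_target(self, label: int) -> int:
        # Target labels are shifted past the outer classes (an assumption).
        return self.num_outer_classes + label

    def is_target(self, joint_label: int) -> bool:
        return self.num_outer_classes <= joint_label < self.size

    def to_target(self, joint_label: int) -> int:
        # Map a joint label back to the target dataset's own label index.
        if not self.is_target(joint_label):
            raise ValueError(f"{joint_label} is not a target-domain label")
        return joint_label - self.num_outer_classes


def build_joint_dataset(
    outer: Sequence[Sample], target: Sequence[Sample], space: JointLabelSpace
) -> List[Sample]:
    """Merge both datasets so a single conditional GAN trains on them together."""
    joint = [(x, space.from_outer(y)) for x, y in outer]
    joint += [(x, space.from_target(y)) for x, y in target]
    return joint


def sample_target_conditions(
    space: JointLabelSpace, n: int, rng: random.Random
) -> List[int]:
    """Draw n joint labels restricted to the target domain, for generation."""
    return [
        space.from_target(rng.randrange(space.num_target_classes))
        for _ in range(n)
    ]
```

Under these assumptions, a trained generator would receive noise plus the labels from `sample_target_conditions`, and each generated image would be relabeled with `to_target` before being added to the real target training set for the classifier.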