Conditional Transferring Features: Scaling GANs to Thousands of Classes with 30% Less High-quality Data for Training
This work addresses data efficiency and scalability in generative models, with applications such as image synthesis and handwriting generation, although it appears incremental in that it builds on existing GAN frameworks.
The paper tackles the problem of scaling GANs to thousands of classes while reducing the need for high-quality training data. With 30% fewer high-quality images, the method still outperforms previous methods on CIFAR-10 and STL-10, and it obtains the best quality when generating 1,000 ImageNet classes and all 3,755 classes of Chinese handwriting characters.
Generative adversarial networks (GANs) have greatly improved the quality of unsupervised image generation. Previous GAN-based methods often require a large amount of high-quality training data while producing only a small number (e.g., tens) of classes. This work aims to scale GANs up to thousands of classes while reducing the amount of high-quality data used in training. We propose an image generation method based on conditional transferring features, which capture pixel-level semantic changes when low-quality images are transformed into high-quality ones. Moreover, self-supervised learning is integrated into our GAN architecture to provide additional label-free semantic supervision derived from the training data. As a result, training our GAN architecture requires far fewer high-quality images, supplemented by a small number of additional low-quality images. Experiments on CIFAR-10 and STL-10 show that, even after 30% of the high-quality images are removed from the training set, our method still outperforms previous ones. Scalability in the number of object classes has been validated experimentally: with 30% fewer high-quality images, our method obtains the best quality in generating 1,000 ImageNet classes, as well as in generating all 3,755 classes of CASIA-HWDB1.0 Chinese handwriting characters.
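To make the "30% fewer high-quality images" setting concrete, the following is a minimal sketch of how such a reduced training set could be constructed. It is not the authors' protocol: the abstract does not say how the removed images are chosen, so the per-class (stratified) random removal, the function name `split_high_quality`, and the toy label list are all assumptions made for illustration.

```python
import random
from collections import defaultdict


def split_high_quality(labels, drop_fraction=0.3, seed=0):
    """Return (kept, dropped) index lists, dropping `drop_fraction` of each class.

    Assumption: removal is random and stratified by class; the source only
    states that 30% of the high-quality images are removed.
    """
    if not 0.0 <= drop_fraction <= 1.0:
        raise ValueError("drop_fraction must be in [0, 1]")

    rng = random.Random(seed)

    # Group sample indices by their class label.
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)

    kept, dropped = [], []
    for label in sorted(by_class):
        indices = by_class[label][:]
        rng.shuffle(indices)
        n_drop = int(round(len(indices) * drop_fraction))
        dropped.extend(indices[:n_drop])
        kept.extend(indices[n_drop:])
    return sorted(kept), sorted(dropped)


# Hypothetical toy labels: 10 classes with 100 samples each.
labels = [c for c in range(10) for _ in range(100)]
kept, dropped = split_high_quality(labels, drop_fraction=0.3)
```

In this sketch, `kept` indexes the high-quality images that remain available for training, while `dropped` marks the images withheld. How the additional low-quality images are obtained is not specified in the abstract, so that step is deliberately left out.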