Is synthetic data from generative models ready for image recognition?
This work addresses data scarcity in image recognition for researchers and practitioners, though its contribution is incremental in that it explores existing generative models.
The study investigates the applicability of synthetic images from text-to-image models to image recognition, finding that they can improve classification in data-scarce settings and aid large-scale pre-training, while highlighting their limitations and proposing strategies for more effective use.
Recent text-to-image generation models have shown promising results in generating high-fidelity, photo-realistic images. Though the results are astonishing to human eyes, how applicable these generated images are to recognition tasks remains under-explored. In this work, we extensively study whether and how synthetic images generated by state-of-the-art text-to-image generation models can be used for image recognition tasks, focusing on two perspectives: synthetic data for improving classification models in data-scarce settings (i.e., zero-shot and few-shot), and synthetic data for large-scale model pre-training for transfer learning. We showcase the strengths and shortcomings of synthetic data from existing generative models, and propose strategies for applying synthetic data more effectively to recognition tasks. Code: https://github.com/CVMI-Lab/SyntheticData.