CV | Apr 13, 2021

DatasetGAN: Efficient Labeled Data Factory with Minimal Human Effort

arXiv:2104.06490v2 | 392 citations
AI Analysis

This method addresses the data-hungry nature of deep networks by enabling efficient dataset creation for computer vision, reducing annotation effort significantly.

DatasetGAN generates large-scale, high-quality semantically segmented image datasets with minimal human annotation: a GAN, paired with a label decoder trained on a few labeled examples, produces an effectively unlimited supply of annotated images. Models trained on this data perform on par with fully supervised methods that in some cases need as much as 100x more annotated data, as demonstrated on tasks such as human face and car part segmentation.

We introduce DatasetGAN: an automatic procedure to generate massive datasets of high-quality semantically segmented images requiring minimal human effort. Current deep networks are extremely data-hungry, benefiting from training on large-scale datasets, which are time-consuming to annotate. Our method relies on the power of recent GANs to generate realistic images. We show how the GAN latent code can be decoded to produce a semantic segmentation of the image. Training the decoder only needs a few labeled examples to generalize to the rest of the latent space, resulting in an infinite annotated dataset generator! These generated datasets can then be used for training any computer vision architecture just as real datasets are. As only a few images need to be manually segmented, it becomes possible to annotate images in extreme detail and generate datasets with rich object and part segmentations. To showcase the power of our approach, we generated datasets for 7 image segmentation tasks which include pixel-level labels for 34 human face parts and 32 car parts. Our approach outperforms all semi-supervised baselines significantly and is on par with fully supervised methods, which in some cases require as much as 100x more annotated data than our method.
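The pipeline in the abstract (sample from a GAN, decode its internal representation into a segmentation, train the decoder on a few labeled samples, then label new samples automatically) can be illustrated with a small toy sketch. This is not the authors' implementation. `toy_generator`, `train_label_decoder`, and `decode_labels` are hypothetical names, the "GAN" is a hand-written function whose per-pixel features encode the object boundary by construction, and the decoder is a plain logistic regression chosen for brevity.

```python
import numpy as np

SIZE = 8  # toy images are SIZE x SIZE pixels


def toy_generator(z):
    """Hypothetical stand-in for a pretrained GAN (assumption, not StyleGAN).

    Maps a latent code z to per-pixel features, shape (SIZE*SIZE, 2), and to
    the mask a human annotator would draw, shape (SIZE*SIZE,).
    """
    # The latent code decides where the object starts (a half-integer column).
    boundary = np.floor(2 + 4 / (1 + np.exp(-z[0]))) + 0.5
    xs = np.tile(np.arange(SIZE), SIZE).astype(float)    # column index per pixel
    ys = np.repeat(np.arange(SIZE), SIZE).astype(float)  # row index per pixel
    # Feature 0 carries the semantics; feature 1 is an irrelevant, centered row index.
    features = np.stack([xs - boundary, ys - (SIZE - 1) / 2], axis=1)
    mask = (xs > boundary).astype(int)
    return features, mask


def train_label_decoder(features, labels, lr=0.1, steps=2000):
    """Fit a per-pixel logistic-regression decoder with plain gradient descent."""
    w = np.zeros(features.shape[1])
    b = 0.0
    for _ in range(steps):
        p = 1 / (1 + np.exp(-(features @ w + b)))  # predicted foreground probability
        grad = p - labels                          # gradient of the log loss w.r.t. logits
        w -= lr * features.T @ grad / len(labels)
        b -= lr * grad.mean()
    return w, b


def decode_labels(features, w, b):
    """Turn per-pixel features into a binary segmentation mask."""
    return (features @ w + b > 0).astype(int)


rng = np.random.default_rng(0)

# Step 1: "annotate" only a few sampled images and train the decoder on them.
few = [toy_generator(rng.normal(size=4)) for _ in range(4)]
train_x = np.concatenate([f for f, _ in few])
train_y = np.concatenate([m for _, m in few])
w, b = train_label_decoder(train_x, train_y)

# Step 2: sample fresh latent codes and label them automatically.
fresh = [toy_generator(rng.normal(size=4)) for _ in range(100)]
predicted = np.concatenate([decode_labels(f, w, b) for f, _ in fresh])
reference = np.concatenate([m for _, m in fresh])
accuracy = (predicted == reference).mean()
print(f"pixel accuracy on 100 unseen samples: {accuracy:.3f}")
```

The toy works only because the features separate the classes cleanly, which mirrors the premise that GAN representations already carry semantic information. With a real generator, the features would come from its intermediate layers, and the (image, predicted mask) pairs from step 2 would then train an ordinary segmentation network.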

Code Implementations: 2 repos
