LG, AI, CV | Feb 23, 2022

When do GANs replicate? On the choice of dataset size

arXiv:2202.11765v1 | 63 citations
Originality: Incremental advance
AI Analysis

The paper provides a practical tool for estimating the minimal dataset size needed to prevent GAN replication, which helps researchers and practitioners construct and select datasets.

The study investigates how dataset size and complexity affect GAN replication and perceptual quality, finding that replication percentage decays exponentially with dataset size and complexity, while image quality follows a U-shape trend.

Do GANs replicate training images? Previous studies have shown that GANs do not seem to replicate training data without significant changes to the training procedure. This has led to a line of research on the exact conditions needed for GANs to overfit to the training data. Although a number of factors have been theoretically or empirically identified, the effect of dataset size and complexity on GAN replication is still unknown. With empirical evidence from BigGAN and StyleGAN2 on the CelebA, Flower, and LSUN-bedroom datasets, we show that dataset size and complexity play an important role in GAN replication and in the perceptual quality of the generated images. We further quantify this relationship, discovering that the replication percentage decays exponentially with respect to dataset size and complexity, with a shared decay factor across GAN-dataset combinations. Meanwhile, the perceptual image quality follows a U-shaped trend w.r.t. dataset size. This finding leads to a practical tool for one-shot estimation of the minimal dataset size needed to prevent GAN replication, which can be used to guide dataset construction and selection.
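
The one-shot estimation idea in the abstract can be illustrated with a small sketch. The abstract does not give the fitted functional form, so the code below assumes a simple model r(N) = a * exp(-k * N), where r is the replication percentage, N is the dataset size, and k stands in for the shared decay factor. The function name `min_dataset_size`, its parameters, and the numbers in the usage note are all hypothetical, and dataset complexity is ignored. This is not the paper's actual estimator.

```python
import math


def min_dataset_size(n_obs: int, r_obs: float, k: float, r_target: float) -> int:
    """Estimate the smallest dataset size whose replication stays at or below r_target.

    Assumed model (not taken from the paper): r(N) = a * exp(-k * N).
    A single observation (n_obs, r_obs) fixes a, so one training run
    is enough once the decay factor k is treated as known.
    """
    if n_obs <= 0 or r_obs <= 0 or r_target <= 0 or k <= 0:
        raise ValueError("all arguments must be positive")
    # Already at or below the target: the observed size suffices.
    if r_obs <= r_target:
        return n_obs
    # Solve r_obs * exp(-k * (N - n_obs)) = r_target for N.
    extra = math.log(r_obs / r_target) / k
    return math.ceil(n_obs + extra)
```

For example, under these assumptions, one run on 1000 images that shows 50% replication, with a hypothetical decay factor of 0.001 per image and a 5% target, gives `min_dataset_size(1000, 50.0, 0.001, 5.0)`, which solves to roughly 3303 images.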

Code Implementations: 1 repo