Learning to Generate Novel Scene Compositions from Single Images and Videos
This work addresses data scarcity in generative modeling, a concern for both researchers and practitioners, and offers an incremental improvement over existing single-image GAN methods.
The paper tackles the challenge of training GANs on minimal data, such as a single image or a single video. It introduces One-Shot GAN, whose two-branch discriminator judges content realism and layout realism separately, enabling synthesis of novel scene compositions with higher diversity and quality than prior single-image GAN models.
Training GANs in low-data regimes remains a challenge, as overfitting often leads to memorization or training divergence. In this work, we introduce One-Shot GAN, which can learn to generate samples from a training set as small as a single image or a single video. We propose a two-branch discriminator, with content and layout branches designed to judge the realism of the internal content separately from the realism of the scene layout. This allows synthesis of visually plausible, novel compositions of a scene, with varying content and layout, while preserving the context of the original sample. Compared to previous single-image GAN models, One-Shot GAN achieves higher diversity and quality of synthesis. It is also not restricted to the single-image setting, and learns successfully in the newly introduced single-video setting.
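To make the idea of separating content from layout concrete, the following is a minimal sketch of one plausible way to derive two such views from a single feature map. The text above does not specify how the two branches are built, so the pooling choices here are assumptions for illustration: the content view averages each channel over all spatial positions, and the layout view averages over channels at each position. The function names and the toy feature map are hypothetical.

```python
# Illustrative sketch only: the pooling choices below are assumptions,
# not the architecture described in the paper.
# A feature map is a nested list indexed as features[channel][row][col].


def content_descriptor(features):
    """Average each channel over all spatial positions (discards layout)."""
    return [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in features
    ]


def layout_descriptor(features):
    """Average over channels at each position (discards per-channel content)."""
    n_channels = len(features)
    height, width = len(features[0]), len(features[0][0])
    return [
        [
            sum(features[c][y][x] for c in range(n_channels)) / n_channels
            for x in range(width)
        ]
        for y in range(height)
    ]


def flip_horizontally(features):
    """Mirror every channel left to right: same content, different layout."""
    return [[list(reversed(row)) for row in channel] for channel in features]


# Hypothetical 2-channel, 2x3 feature map.
features = [
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]],
    [[0.0, 1.0, 0.0], [2.0, 0.0, 4.0]],
]
flipped = flip_horizontally(features)

# The content view cannot tell the two maps apart; the layout view can.
same_content = content_descriptor(features) == content_descriptor(flipped)
same_layout = layout_descriptor(features) == layout_descriptor(flipped)
```

In this toy setup, mirroring the feature map leaves the content view unchanged while altering the layout view. A discriminator that scores the two views separately can therefore, in principle, accept rearranged content as long as each view looks realistic on its own.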