Peer is Your Pillar: A Data-unbalanced Conditional GANs for Few-shot Image Generation
This work addresses a domain-specific computer vision problem, relevant to researchers and practitioners who need efficient few-shot image generation. The contribution is incremental, since it builds on existing GAN and transfer learning methods.
The paper tackles few-shot image generation, where existing transfer learning methods offer little control over how much knowledge is preserved from the source model, a limitation that matters when the source and target domains are unrelated. It proposes a pipeline called Peer is Your Pillar (PIP) that combines the target dataset with a peer dataset for data-unbalanced conditional generation, and it reports reduced training requirements on various datasets.
Few-shot image generation aims to train generative models using a small number of training images. When only a few images are available for training (e.g., 10 images), Learning From Scratch (LFS) methods often generate images that closely resemble the training data, while Transfer Learning (TL) methods try to improve performance by leveraging prior knowledge from GANs pre-trained on large-scale datasets. However, current TL methods may not allow sufficient control over the degree of knowledge preservation from the source model, making them unsuitable for setups where the source and target domains are not closely related. To address this, we propose a novel pipeline called Peer is Your Pillar (PIP), which combines a target few-shot dataset with a peer dataset to form a data-unbalanced conditional generation task. Our approach includes a class embedding method that separates the class space from the latent space, and we use a direction loss based on pre-trained CLIP to improve image diversity. Experiments on various few-shot datasets demonstrate the effectiveness of the proposed PIP, which in particular reduces the training requirements of few-shot image generation.
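The abstract does not define the direction loss, so the following is only a minimal sketch of one plausible form, not the authors' formulation. It assumes that four image embeddings are already available (in practice they would come from a pre-trained CLIP image encoder): two images generated under the peer class from latents z1 and z2, and two generated under the target class from the same latents. The loss is then one minus the cosine similarity between the two difference vectors, so that variation across latents in the peer class is encouraged to carry over to the target class. The function names `cosine_similarity` and `direction_loss` and the four-embedding setup are hypothetical.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity of two vectors; returns 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def direction_loss(
    peer_z1: Sequence[float],
    peer_z2: Sequence[float],
    target_z1: Sequence[float],
    target_z2: Sequence[float],
) -> float:
    """Assumed form: 1 - cos(peer_z2 - peer_z1, target_z2 - target_z1).

    Inputs stand in for CLIP image embeddings of generated images.
    The value is 0 when the two directions are parallel and 2 when opposite.
    """
    peer_direction = [b - a for a, b in zip(peer_z1, peer_z2)]
    target_direction = [b - a for a, b in zip(target_z1, target_z2)]
    return 1.0 - cosine_similarity(peer_direction, target_direction)


if __name__ == "__main__":
    # Toy 2-D embeddings used purely for illustration.
    print(direction_loss([0.0, 0.0], [1.0, 0.0], [5.0, 5.0], [7.0, 5.0]))
```

Under this assumed form, a generator that collapses to the same target image for both latents yields a zero difference vector and therefore a loss of 1 rather than 0, which is one way such a term could penalize low diversity.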