CV, Aug 30, 2023

Improving Few-shot Image Generation by Structural Discrimination and Textural Modulation

arXiv:2308.16110v1, 6 citations, h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses the challenge of generating diverse, high-quality images from limited examples. The advance is incremental but offers practical improvements for computer vision applications.

The paper tackles few-shot image generation by proposing a textural modulation mechanism and a structural discriminator to improve semantic fusion and layout fidelity, achieving state-of-the-art synthesis performance on three datasets.

Few-shot image generation, which aims to produce plausible and diverse images for one category given a few images from this category, has drawn extensive attention. Existing approaches either globally interpolate different images or fuse local representations with pre-defined coefficients. However, such an intuitive combination of images/features only exploits the most relevant information for generation, leading to poor diversity and coarse-grained semantic fusion. To remedy this, this paper proposes a novel textural modulation (TexMod) mechanism to inject external semantic signals into internal local representations. Parameterized by the feedback from the discriminator, our TexMod enables more fine-grained semantic injection while maintaining the synthesis fidelity. Moreover, a global structural discriminator (StructD) is developed to explicitly guide the model to generate images with reasonable layout and outline. Furthermore, the frequency awareness of the model is reinforced by encouraging the model to distinguish frequency signals. Combining these techniques, we build a novel and effective model for few-shot image generation. The effectiveness of our model is demonstrated by extensive experiments on three popular datasets and various settings. Besides achieving state-of-the-art synthesis performance on these datasets, our proposed techniques could be seamlessly integrated into existing models for a further performance boost.
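The abstract does not say how the "frequency signals" are obtained. As a rough illustration only, the sketch below assumes a one-level 2D Haar-style decomposition, which is one common way to split an image into a low-frequency band and high-frequency detail bands that a discriminator could then be asked to tell apart. The function name `haar_decompose`, the band names, and the averaging normalization (dividing by 4 rather than using the orthonormal scaling) are illustrative choices, not taken from the paper.

```python
def haar_decompose(image):
    """One-level 2D Haar-style split of a 2D list of numbers.

    Assumption for illustration: this is NOT confirmed as the paper's method.
    Height and width must be even. Returns four half-resolution bands:
    low (local average), row_diff (top vs. bottom), col_diff (left vs. right),
    and diag (diagonal difference).
    """
    height, width = len(image), len(image[0])
    if height % 2 or width % 2:
        raise ValueError("height and width must be even")

    low, row_diff, col_diff, diag = [], [], [], []
    for r in range(0, height, 2):
        low_row, rd_row, cd_row, dg_row = [], [], [], []
        for c in range(0, width, 2):
            # The 2x2 block: a b / d e
            a, b = image[r][c], image[r][c + 1]
            d, e = image[r + 1][c], image[r + 1][c + 1]
            low_row.append((a + b + d + e) / 4.0)  # low-frequency content
            rd_row.append((a + b - d - e) / 4.0)   # responds to horizontal edges
            cd_row.append((a - b + d - e) / 4.0)   # responds to vertical edges
            dg_row.append((a - b - d + e) / 4.0)   # responds to diagonal detail
        low.append(low_row)
        row_diff.append(rd_row)
        col_diff.append(cd_row)
        diag.append(dg_row)
    return low, row_diff, col_diff, diag


# A flat patch has no high-frequency content; a vertical edge shows up in col_diff.
flat_bands = haar_decompose([[5, 5], [5, 5]])
edge_bands = haar_decompose([[0, 8], [0, 8]])
```

In a setup like this, the three detail bands would carry the texture and edge information, while the low band carries coarse layout. How the paper actually feeds frequency information to the model is not stated in this summary.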

Code Implementations: 1 repo