CV · AI · Apr 1, 2021

Linear Semantics in Generative Adversarial Networks

arXiv:2104.00487v1 · 39 citations
Originality: Incremental advance
AI Analysis

The work gives users more precise semantic control over generated images, though it is incremental in that it builds on existing GAN frameworks.

The authors address the problem of explicitly controlling semantics in GAN-generated images. They find that GANs encode semantics linearly in their feature maps, which enables semantic segmentation from as few as 8 labeled images, and they propose two few-shot editing methods.

Generative Adversarial Networks (GANs) are able to generate high-quality images, but it remains difficult to explicitly specify the semantics of synthesized images. In this work, we aim to better understand the semantic representation of GANs, and thereby enable semantic control in GAN's generation process. Interestingly, we find that a well-trained GAN encodes image semantics in its internal feature maps in a surprisingly simple way: a linear transformation of feature maps suffices to extract the generated image semantics. To verify this simplicity, we conduct extensive experiments on various GANs and datasets; and thanks to this simplicity, we are able to learn a semantic segmentation model for a trained GAN from a small number (e.g., 8) of labeled images. Last but not least, leveraging our findings, we propose two few-shot image editing approaches, namely Semantic-Conditional Sampling and Semantic Image Editing. Given a trained GAN and as few as eight semantic annotations, the user is able to generate diverse images subject to a user-provided semantic layout, and control the synthesized image semantics. We have made the code publicly available.
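The central claim, that a linear transformation of feature maps suffices to extract semantics, can be illustrated with a minimal sketch. This is not the authors' released code: the function names, the nearest-neighbour upsampling, the concatenation of layers, and the random stand-in features and weights are all assumptions made for illustration. It only shows what a per-pixel linear readout from generator feature maps to class labels could look like.

```python
import numpy as np

def upsample_nearest(feat, size):
    """Nearest-neighbour upsampling of a (C, h, w) map to (C, size, size).
    Hypothetical helper; the interpolation choice is an assumption."""
    _, h, w = feat.shape
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return feat[:, rows][:, :, cols]

def linear_semantics(features, weights, bias, size):
    """Apply one linear map per pixel to the stacked feature maps.

    features: list of (C_i, h_i, w_i) arrays, one per generator layer
    weights:  (num_classes, sum of C_i) matrix
    bias:     (num_classes,) vector
    Returns a (size, size) array of predicted class indices.
    """
    stacked = np.concatenate(
        [upsample_nearest(f, size) for f in features], axis=0
    )
    # Purely linear readout: no hidden layers or nonlinearities.
    logits = np.einsum("kc,chw->khw", weights, stacked) + bias[:, None, None]
    return logits.argmax(axis=0)

# Random stand-ins for a trained generator's feature maps (assumption).
rng = np.random.default_rng(0)
features = [rng.normal(size=(4, 8, 8)), rng.normal(size=(2, 16, 16))]
weights = rng.normal(size=(3, 6))   # 3 hypothetical semantic classes
bias = np.zeros(3)
seg = linear_semantics(features, weights, bias, size=16)
```

In practice the weights would be fitted to the small set of labeled images mentioned in the abstract. Because the readout has so few parameters, it is plausible that a handful of annotations is enough to fit it.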

Code Implementations: 1 repo
