CVAIGRLGMar 18, 2019

Semantic Image Synthesis with Spatially-Adaptive Normalization

arXiv:1903.07291v23131 citationsHas Code
Originality Incremental advance
AI Analysis

This work solves the problem of generating realistic images from semantic maps for applications in computer vision and graphics, representing an incremental improvement over previous methods.

The paper tackles the problem of synthesizing photorealistic images from semantic layouts by addressing the issue where normalization layers wash away semantic information, resulting in improved visual fidelity and alignment with input layouts across several challenging datasets.

We propose spatially-adaptive normalization, a simple but effective layer for synthesizing photorealistic images given an input semantic layout. Previous methods directly feed the semantic layout as input to the deep network, which is then processed through stacks of convolution, normalization, and nonlinearity layers. We show that this is suboptimal as the normalization layers tend to ``wash away'' semantic information. To address the issue, we propose using the input layout for modulating the activations in normalization layers through a spatially-adaptive, learned transformation. Experiments on several challenging datasets demonstrate the advantage of the proposed method over existing approaches, regarding both visual fidelity and alignment with input layouts. Finally, our model allows user control over both semantic and style. Code is available at https://github.com/NVlabs/SPADE .

Code Implementations24 repos

Data from Papers with Code (CC-BY-SA-4.0)

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes