CV · Apr 9, 2018

Recovering Realistic Texture in Image Super-resolution by Deep Spatial Feature Transform

arXiv:1804.02815v1 · 1164 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of texture fidelity in image super-resolution for applications that require high-quality visual output. It is an incremental improvement over existing methods.

The paper tackles the recovery of natural, realistic texture in single-image super-resolution (SR). It modulates the intermediate features of a convolutional neural network conditioned on semantic segmentation probability maps, and reports more realistic and visually pleasing textures than the state-of-the-art methods SRGAN and EnhanceNet.

Although convolutional neural networks (CNNs) have recently demonstrated high-quality reconstruction for single-image super-resolution (SR), recovering natural and realistic texture remains a challenging problem. In this paper, we show that it is possible to recover textures faithful to semantic classes. In particular, we only need to modulate features of a few intermediate layers in a single network conditioned on semantic segmentation probability maps. This is made possible through a novel Spatial Feature Transform (SFT) layer that generates affine transformation parameters for spatial-wise feature modulation. SFT layers can be trained end-to-end together with the SR network using the same loss function. During testing, the network accepts an input image of arbitrary size and generates a high-resolution image with just a single forward pass conditioned on the categorical priors. Our final results show that an SR network equipped with SFT can generate more realistic and visually pleasing textures in comparison to state-of-the-art SRGAN and EnhanceNet.
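The spatial-wise affine modulation described in the abstract can be illustrated with a minimal NumPy sketch. It assumes the modulation takes the form output = gamma * features + beta, where gamma and beta are per-pixel maps predicted from the segmentation probability maps. The helper names (conv1x1, sft_layer), the single 1x1 mapping per parameter, the tensor shapes, and the random weights are all hypothetical stand-ins chosen for illustration; they are not the authors' architecture or trained parameters.

```python
import numpy as np


def conv1x1(x, weight, bias):
    """Per-pixel linear map (a 1x1 convolution).

    x: (C_in, H, W), weight: (C_out, C_in), bias: (C_out,)
    """
    return np.einsum("oc,chw->ohw", weight, x) + bias[:, None, None]


def sft_layer(features, condition, params):
    """Modulate features (C, H, W) with parameters derived from condition (K, H, W).

    Assumption: gamma and beta each come from one 1x1 map of the condition.
    This is a simplification for illustration, not the paper's exact design.
    """
    gamma = conv1x1(condition, params["w_gamma"], params["b_gamma"])  # scale map
    beta = conv1x1(condition, params["w_beta"], params["b_beta"])     # shift map
    return gamma * features + beta  # element-wise affine transform per pixel


rng = np.random.default_rng(0)
C, K, H, W = 4, 3, 5, 6  # feature channels, semantic classes, height, width

features = rng.standard_normal((C, H, W))

# Hypothetical segmentation probability maps: softmax over the class axis.
logits = rng.standard_normal((K, H, W))
exp = np.exp(logits - logits.max(axis=0, keepdims=True))
seg_probs = exp / exp.sum(axis=0, keepdims=True)

# Random weights stand in for parameters that would be learned end-to-end.
params = {
    "w_gamma": rng.standard_normal((C, K)),
    "b_gamma": np.ones(C),
    "w_beta": rng.standard_normal((C, K)),
    "b_beta": np.zeros(C),
}

out = sft_layer(features, seg_probs, params)
print(out.shape)  # (4, 5, 6)
```

Because gamma and beta have the same spatial size as the features, each pixel receives its own scale and shift. This is how different semantic regions within a single image can be treated differently in one forward pass.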

Code Implementations: 4 repos
