CV | Dec 5, 2020

Spatially-Adaptive Pixelwise Networks for Fast Image Translation

arXiv:2012.02992v1 | 88 citations
AI Analysis

This work gives researchers and practitioners working on real-time image-to-image translation a generator that runs up to 18x faster than state-of-the-art baselines.

This paper introduces a new generator architecture for fast and efficient high-resolution image-to-image translation. The proposed model achieves up to an 18x speedup compared to state-of-the-art baselines while maintaining comparable visual quality across various image resolutions and translation domains.

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three important steps to equip such a seemingly simple function with adequate expressivity. First, the parameters of the pixel-wise networks are spatially varying, so they can represent a broader function class than simple 1x1 convolutions. Second, these parameters are predicted by a fast convolutional network that processes an aggressively low-resolution representation of the input. Third, we augment the input image with a sinusoidal encoding of spatial coordinates, which provides an effective inductive bias for generating realistic novel high-frequency image content. As a result, our model is up to 18x faster than state-of-the-art baselines. We achieve this speedup while generating comparable visual quality across different image resolutions and translation domains.
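The three steps in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The abstract does not give layer sizes, the upsampling scheme, the nonlinearity, or the encoding frequencies, so everything below is an assumption: random values stand in for the output of the low-resolution convolutional predictor, parameters are upsampled by nearest-neighbor repetition, the nonlinearity is ReLU, and the helper names (`positional_encoding`, `upsample_nearest`, `pixelwise_mlp`) are hypothetical.

```python
import numpy as np


def positional_encoding(height, width, num_freqs):
    """Sinusoidal encoding of pixel coordinates, shape (H, W, 4 * num_freqs)."""
    ys, xs = np.meshgrid(np.arange(height), np.arange(width), indexing="ij")
    feats = []
    for k in range(num_freqs):
        # Assumed frequency schedule: periods of 2, 4, 8, ... pixels.
        freq = np.pi / (2 ** k)
        feats += [np.sin(freq * xs), np.cos(freq * xs),
                  np.sin(freq * ys), np.cos(freq * ys)]
    return np.stack(feats, axis=-1)


def upsample_nearest(grid, factor):
    """Repeat each low-resolution cell `factor` times along height and width."""
    return np.repeat(np.repeat(grid, factor, axis=0), factor, axis=1)


def pixelwise_mlp(features, weights, biases):
    """Apply a per-pixel MLP whose parameters differ from pixel to pixel.

    features:   (H, W, C_0)
    weights[i]: (H, W, C_i, C_{i+1})
    biases[i]:  (H, W, C_{i+1})
    """
    x = features
    last = len(weights) - 1
    for i, (w, b) in enumerate(zip(weights, biases)):
        # Affine map applied independently at every pixel (no spatial mixing).
        x = np.einsum("hwc,hwcd->hwd", x, w) + b
        if i < last:
            x = np.maximum(x, 0.0)  # ReLU, an assumed nonlinearity
    return x


rng = np.random.default_rng(0)
H, W, factor, num_freqs = 16, 16, 8, 3
image = rng.random((H, W, 3))

# Step 3: augment the input with a sinusoidal encoding of coordinates.
features = np.concatenate(
    [image, positional_encoding(H, W, num_freqs)], axis=-1)

# Step 2 (placeholder): a real model would predict these parameter grids
# with a convolutional network run on a low-resolution copy of the input.
channels = [features.shape[-1], 8, 3]
lowres = (H // factor, W // factor)
layer_dims = list(zip(channels[:-1], channels[1:]))
weights = [upsample_nearest(rng.normal(size=lowres + (cin, cout)), factor)
           for cin, cout in layer_dims]
biases = [upsample_nearest(rng.normal(size=lowres + (cout,)), factor)
          for _, cout in layer_dims]

# Step 1: spatially varying pixel-wise network at full resolution.
output = pixelwise_mlp(features, weights, biases)
print(output.shape)
```

Because the full-resolution path contains only per-pixel affine maps, its cost scales linearly with the number of pixels, while the heavier parameter predictor runs only on the low-resolution grid. That split is the source of the speedup the abstract describes.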

Code Implementations: 1 repo
