CV · Nov 29, 2023

Improving Feature Stability during Upsampling -- Spectral Artifacts and the Importance of Spatial Context

arXiv:2311.17524v2 · 17 citations · h-index: 17
Originality: Synthesis-oriented
AI Analysis

This paper addresses stability issues in pixel-wise prediction tasks such as image restoration and segmentation. It offers an incremental improvement that leverages spatial context rather than introducing new methods.

The paper tackles the problem of spectral artifacts during upsampling in neural networks for pixel-wise predictions, finding that providing large spatial context enables stable and high-quality outputs without requiring explicit anti-aliasing filters.

Pixel-wise predictions are required in a wide variety of tasks such as image restoration, image segmentation, or disparity estimation. Common models involve several stages of data resampling, in which the resolution of feature maps is first reduced to aggregate information and then increased to generate a high-resolution output. Previous works have shown that resampling operations are subject to artifacts such as aliasing. During downsampling, aliases have been shown to compromise the prediction stability of image classifiers. During upsampling, they have been leveraged to detect generated content. However, the effect of aliases during upsampling has not yet been discussed with respect to the stability and robustness of pixel-wise predictions. While falling under the same term (aliasing), the challenges for correct upsampling in neural networks differ significantly from those during downsampling: when downsampling, some high frequencies cannot be correctly represented and have to be removed to avoid aliases. When upsampling for pixel-wise predictions, by contrast, we actually require the model to restore such high frequencies that cannot be encoded in lower resolutions. The application of findings from signal processing is therefore a necessary but not a sufficient condition to achieve the desired output. We find that the availability of large spatial context during upsampling makes it possible to provide stable, high-quality pixel-wise predictions, even when fully learning all filter weights.
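The spectral side of the abstract can be illustrated with a small 1D sketch. It is not the paper's method: it uses fixed, hand-picked filters instead of learned ones, and every name in it (`zero_insert_upsample`, `circular_filter`, `upper_band_fraction`, the test signal, the two kernels) is an assumption made for illustration. It shows only that zero-insertion upsampling creates a spectral replica in the upper half of the band, and that a filter spanning more samples can suppress that replica more strongly than a 3-tap one.

```python
import numpy as np

def zero_insert_upsample(x, factor=2):
    """Place factor-1 zeros between samples (the resampling step of a strided transposed convolution)."""
    y = np.zeros(len(x) * factor)
    y[::factor] = x
    return y

def circular_filter(signal, kernel):
    """Circular convolution via the FFT, so border effects do not leak into the spectrum."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(kernel, n=len(signal))))

def upper_band_fraction(signal):
    """Fraction of spectral energy in the upper half of the band, where the replica lands."""
    power = np.abs(np.fft.rfft(signal)) ** 2
    return power[len(power) // 2:].sum() / power.sum()

# Hypothetical low-resolution signal: a sinusoid with 20 cycles over 64 samples.
n = 64
x = np.sin(2 * np.pi * 20 * np.arange(n) / n)

# Upsample by 2 with zero insertion; the spectrum now contains a mirrored replica.
up = zero_insert_upsample(x)

# Two fixed interpolation kernels with different spatial extent (illustrative choices).
narrow = np.array([0.5, 1.0, 0.5])                 # 3-tap linear interpolation
taps = np.arange(-16, 17)
wide = np.sinc(taps / 2) * np.hamming(len(taps))   # 33-tap windowed sinc, cutoff at half band

frac_raw = upper_band_fraction(up)                              # about 0.5: replica as strong as the signal
frac_narrow = upper_band_fraction(circular_filter(up, narrow))  # replica only partly suppressed
frac_wide = upper_band_fraction(circular_filter(up, wide))      # replica suppressed far more strongly
print(frac_raw, frac_narrow, frac_wide)
```

In this toy setting the kernel with the larger extent leaves the least replica energy. It does not address the abstract's further point that pixel-wise prediction also needs genuine high frequencies to be restored, which a fixed low-pass filter cannot do.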

