CV · Mar 30, 2024

Look-Around Before You Leap: High-Frequency Injected Transformer for Image Restoration

arXiv:2404.00279v1 · 3 citations · h-index: 4 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses a known weakness of Transformer-based image restoration, the loss of local detail, and offers an incremental improvement over existing Transformer methods by better preserving high-frequency information.

The paper tackles the limitation of Transformer-based models in capturing local information for image restoration by proposing HIT, a High-frequency Injected Transformer that incorporates high-frequency details and bidirectional interactions, achieving state-of-the-art performance across 9 image restoration tasks with linear computational complexity.

Transformer-based approaches have achieved superior performance in image restoration, since they can model long-term dependencies well. However, the limitation in capturing local information restricts their capacity to remove degradations. While existing approaches attempt to mitigate this issue by incorporating convolutional operations, the core component in Transformer, i.e., self-attention, which serves as a low-pass filter, could unintentionally dilute or even eliminate the acquired local patterns. In this paper, we propose HIT, a simple yet effective High-frequency Injected Transformer for image restoration. Specifically, we design a window-wise injection module (WIM), which incorporates abundant high-frequency details into the feature map, to provide reliable references for restoring high-quality images. We also develop a bidirectional interaction module (BIM) to aggregate features at different scales using a mutually reinforced paradigm, resulting in spatially and contextually improved representations. In addition, we introduce a spatial enhancement unit (SEU) to preserve essential spatial relationships that may be lost due to the computations carried out across channel dimensions in the BIM. Extensive experiments on 9 tasks (real noise, real rain streak, raindrop, motion blur, moiré, shadow, snow, haze, and low-light condition) demonstrate that HIT with linear computational complexity performs favorably against the state-of-the-art methods. The source code and pre-trained models will be available at https://github.com/joshyZhou/HIT.
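To make the low-pass argument in the abstract concrete, here is a minimal NumPy sketch. It is not the paper's WIM, BIM, or SEU: the function names (`box_blur`, `high_freq`, `toy_self_attention`, `inject_high_freq`), the box-filter residual used as a stand-in for "high-frequency details", and the identity query/key/value projections are all assumptions made for illustration. It shows that each softmax-attention output is a convex combination of the inputs, so the output cannot span a wider value range than the input, and that a high-frequency residual can be added back afterwards.

```python
import numpy as np


def box_blur(x, k=3):
    """k x k mean filter with edge padding (k odd): a simple low-pass filter."""
    pad = k // 2
    xp = np.pad(x.astype(np.float64), pad, mode="edge")
    h, w = x.shape
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(k):
        for dx in range(k):
            out += xp[dy:dy + h, dx:dx + w]
    return out / (k * k)


def high_freq(x, k=3):
    """High-frequency residual: the input minus its low-pass version."""
    return x - box_blur(x, k)


def toy_self_attention(tokens):
    """Single-head attention with identity Q/K/V projections (assumption).

    tokens has shape (n, d). Every output row is a softmax-weighted average
    of the input rows, i.e. a convex combination of them.
    """
    d = tokens.shape[-1]
    scores = tokens @ tokens.T / np.sqrt(d)
    scores = scores - scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ tokens


def inject_high_freq(image, k=3):
    """Hypothetical injection: add the high-frequency residual of the input
    back onto the attention output (illustrative, not the paper's WIM)."""
    tokens = image.reshape(-1, 1).astype(np.float64)
    smoothed = toy_self_attention(tokens).reshape(image.shape)
    return smoothed + high_freq(image, k)
```

Because the attention weights in each row are non-negative and sum to one, the output values always lie between the minimum and maximum of the input, which is one simple way to see the smoothing behaviour the abstract describes. The residual added in `inject_high_freq` is only one of many possible ways to reintroduce detail.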

Code Implementations: 1 repo