CV · Mar 15, 2022

Enriched CNN-Transformer Feature Aggregation Networks for Super-Resolution

arXiv:2203.07682v3 · 100 citations · h-index: 14
Originality: Incremental advance
AI Analysis

This work addresses a specific bottleneck in image super-resolution for computer vision applications, representing an incremental improvement over existing methods.

The paper tackles the problem of shortsightedness in transformer-based super-resolution methods by introducing a hybrid network that aggregates local CNN features with long-range transformer dependencies, achieving state-of-the-art results on multiple benchmark datasets.

Recent transformer-based super-resolution (SR) methods have achieved promising results against conventional CNN-based methods. However, these approaches suffer from essential shortsightedness created by only utilizing the standard self-attention-based reasoning. In this paper, we introduce an effective hybrid SR network to aggregate enriched features, including local features from CNNs and long-range multi-scale dependencies captured by transformers. Specifically, our network comprises transformer and convolutional branches, which synergetically complement each representation during the restoration procedure. Furthermore, we propose a cross-scale token attention module, allowing the transformer branch to exploit the informative relationships among tokens across different scales efficiently. Our proposed method achieves state-of-the-art SR results on numerous benchmark datasets.
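The cross-scale token attention idea mentioned in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the pooling scheme, the function names (`pool_tokens`, `cross_scale_attention`), the single attention head, and the random projection weights are all assumptions made for illustration. It only shows fine-scale tokens attending to a coarser set of tokens, so each token can draw on information aggregated over a wider region; the convolutional branch is not modeled here.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def pool_tokens(tokens, factor):
    """Average every `factor` consecutive tokens to form coarser-scale tokens.

    Hypothetical stand-in for whatever multi-scale tokenization the paper uses.
    """
    n, d = tokens.shape
    assert n % factor == 0, "token count must be divisible by the pooling factor"
    return tokens.reshape(n // factor, factor, d).mean(axis=1)


def cross_scale_attention(fine, coarse, w_q, w_k, w_v):
    """Single-head attention: queries from fine tokens, keys/values from coarse tokens."""
    q = fine @ w_q                      # (n_fine, d)
    k = coarse @ w_k                    # (n_coarse, d)
    v = coarse @ w_v                    # (n_coarse, d)
    scores = q @ k.T / np.sqrt(q.shape[-1])
    weights = softmax(scores, axis=-1)  # each fine token's distribution over coarse tokens
    return weights @ v, weights


rng = np.random.default_rng(0)
n_tokens, dim = 16, 8                   # illustrative sizes, not from the paper
fine = rng.standard_normal((n_tokens, dim))
coarse = pool_tokens(fine, factor=4)

# Random projections stand in for learned weights.
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))
out, weights = cross_scale_attention(fine, coarse, w_q, w_k, w_v)
```

Because keys and values come from the pooled tokens, the attention matrix has shape (fine tokens, coarse tokens) rather than being square, which is also what keeps the cost lower than full self-attention over the fine tokens.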

Code Implementations: 1 repo
