IVAICV Nov 27, 2024

HAAT: Hybrid Attention Aggregation Transformer for Image Super-Resolution

arXiv:2411.18003v3 | 2 citations | h-index: 3 | Other Conferences
Originality: Incremental advance
AI Analysis

This work improves image super-resolution for computer vision applications, but it appears incremental as it builds upon existing transformer-based approaches.

The paper addresses two limitations of existing Swin-transformer-based super-resolution models: self-attention is restricted to non-overlapping windows, and cross-channel information is ignored. It proposes a new model, HAAT, which the authors report surpasses state-of-the-art methods on benchmark datasets.

In image super-resolution research, Swin-transformer-based models are favored for their global spatial modeling and shifting window attention mechanism. However, existing methods often limit self-attention to non-overlapping windows to cut costs, and they ignore the useful information that exists across channels. To address this issue, this paper introduces a novel model, the Hybrid Attention Aggregation Transformer (HAAT), designed to better leverage feature information. HAAT is constructed by integrating Swin-Dense-Residual-Connected Blocks (SDRCB) with Hybrid Grid Attention Blocks (HGAB). SDRCB expands the receptive field while maintaining a streamlined architecture, resulting in enhanced performance. HGAB incorporates channel attention, sparse attention, and window attention to improve nonlocal feature fusion and achieve more visually compelling results. Experimental evaluations demonstrate that HAAT surpasses state-of-the-art methods on benchmark datasets.

Keywords: Image super-resolution, Computer vision, Attention mechanism, Transformer
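
To make the three attention flavors named in the abstract more concrete, here is a minimal NumPy sketch. It is not the authors' implementation, and every function name, shape, and weight in it is an assumption chosen for illustration. It contrasts how non-overlapping window partitioning groups neighboring pixels with how a sparse (grid) partitioning groups strided pixels from across the whole image, and it adds a squeeze-and-excitation style channel gate as one common way to realize channel attention.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) map into non-overlapping ws x ws windows.

    Returns (num_windows, ws*ws, C); attention inside a group only
    sees spatially adjacent pixels.
    """
    H, W, C = x.shape
    x = x.reshape(H // ws, ws, W // ws, ws, C)
    return x.transpose(0, 2, 1, 3, 4).reshape(-1, ws * ws, C)

def grid_partition(x, gs):
    """Split an (H, W, C) map into sparse groups of gs x gs pixels.

    Pixels in a group are strided by H//gs and W//gs, so attention
    inside a group mixes information from distant locations.
    """
    H, W, C = x.shape
    x = x.reshape(gs, H // gs, gs, W // gs, C)
    return x.transpose(1, 3, 0, 2, 4).reshape(-1, gs * gs, C)

def channel_attention(x, w1, w2):
    """Squeeze-and-excitation style gate (an assumed, generic form).

    Global-average-pool over space, pass through a two-layer
    bottleneck, and rescale each channel by a sigmoid weight.
    """
    squeezed = x.mean(axis=(0, 1))              # (C,)
    hidden = np.maximum(squeezed @ w1, 0.0)     # ReLU, (C // r,)
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2))) # sigmoid, (C,)
    return x * gate

# Toy 4x4 single-channel map whose values are the flat pixel indices.
toy = np.arange(16).reshape(4, 4, 1)
local_group = window_partition(toy, 2)[0, :, 0]   # adjacent pixels
sparse_group = grid_partition(toy, 2)[0, :, 0]    # strided pixels

# Channel gate on a random 4x4x8 feature map with hypothetical weights.
rng = np.random.default_rng(0)
feat = rng.standard_normal((4, 4, 8))
gated = channel_attention(feat,
                          rng.standard_normal((8, 2)),
                          rng.standard_normal((2, 8)))
```

On the toy map, the first window holds the neighboring pixels 0, 1, 4, 5, while the first sparse group holds pixels 0, 2, 8, 10, which span the whole image. That difference is the intuition behind combining window attention with sparse attention for nonlocal feature fusion.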

