RouteWinFormer: A Route-Window Transformer for Middle-range Attention in Image Restoration
The work addresses the high computational overhead of Transformer-based image restoration, which matters to both researchers and practitioners, though the contribution is incremental in that it builds on existing Transformer approaches.
The paper tackles the computational inefficiency of long-range attention in image restoration by proposing RouteWinFormer, a window-based Transformer that uses middle-range attention, and it outperforms state-of-the-art methods on 9 datasets.
Transformer models have recently garnered significant attention in image restoration due to their ability to capture long-range pixel dependencies. However, long-range attention often incurs computational overhead without practical necessity, as degradation and context are typically localized. The normalized average attention distance measured across various degradation datasets shows that middle-range attention is sufficient for image restoration. Building on this insight, we propose RouteWinFormer, a novel window-based Transformer that models middle-range context for image restoration. RouteWinFormer incorporates the Route-Windows Attention Module, which dynamically selects relevant nearby windows based on regional similarity for attention aggregation, efficiently extending the receptive field to a middle-range size. In addition, we introduce Multi-Scale Structure Regularization during training, enabling the sub-scale of the U-shaped network to focus on structural information, while the original scale learns degradation patterns based on generalized image structure priors. Extensive experiments demonstrate that RouteWinFormer outperforms state-of-the-art methods across 9 datasets in various image restoration tasks.
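The window-routing idea described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. The mean-pooled regional descriptor, the dot-product similarity, the search `radius`, the top-`k` selection, and all function names are assumptions chosen for illustration, and the sketch omits learned projections, multiple heads, and positional terms.

```python
import numpy as np

def window_partition(x, ws):
    """Split an (H, W, C) map into (nh, nw, ws*ws, C) non-overlapping windows.
    Assumes H and W are divisible by ws."""
    H, W, C = x.shape
    nh, nw = H // ws, W // ws
    x = x.reshape(nh, ws, nw, ws, C).transpose(0, 2, 1, 3, 4)
    return x.reshape(nh, nw, ws * ws, C)

def route_nearby_windows(windows, radius, k):
    """For each window, pick up to k most similar windows within `radius` grid steps.
    Similarity here is a dot product of mean-pooled descriptors (an assumption)."""
    nh, nw, _, _ = windows.shape
    desc = windows.mean(axis=2)  # (nh, nw, C) regional descriptors
    routes = {}
    for i in range(nh):
        for j in range(nw):
            cands = [(a, b)
                     for a in range(max(0, i - radius), min(nh, i + radius + 1))
                     for b in range(max(0, j - radius), min(nw, j + radius + 1))
                     if (a, b) != (i, j)]
            sims = np.array([desc[i, j] @ desc[a, b] for a, b in cands])
            order = np.argsort(-sims)[:k]  # indices of the highest similarities
            routes[(i, j)] = [cands[t] for t in order]
    return routes

def route_window_attention(x, ws=4, radius=1, k=2):
    """Attend from each window to itself plus its routed neighbor windows."""
    windows = window_partition(x, ws)
    nh, nw, n, C = windows.shape
    routes = route_nearby_windows(windows, radius, k)
    out = np.empty_like(windows)
    for (i, j), picked in routes.items():
        q = windows[i, j]  # (n, C) queries from the current window
        kv = np.concatenate([windows[i, j]] + [windows[a, b] for a, b in picked], axis=0)
        logits = q @ kv.T / np.sqrt(C)
        logits -= logits.max(axis=1, keepdims=True)  # numerically stable softmax
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)
        out[i, j] = weights @ kv  # aggregate values over own + routed windows
    return out, routes

x = np.random.default_rng(0).standard_normal((8, 8, 3))
out, routes = route_window_attention(x, ws=4, radius=1, k=2)
```

In this sketch each query attends to (1 + k) windows of tokens rather than to the full image, so the per-window cost grows with k and the window size instead of with the image size, which is one plausible reading of how a middle-range receptive field can be reached efficiently.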