CV · May 10

Spatial-Frequency Gated Swin Transformer for Remote Sensing Single-Image Super-Resolution

arXiv:2605.09687 · 20.3
Predicted impact: top 92% in CV · last 90 days
Originality: Incremental advance
AI Analysis

For remote sensing applications requiring high-resolution imagery, this work offers an incremental improvement in detail reconstruction by separating low- and high-frequency components in the transformer feed-forward network.

SFG-SwinSR improves remote sensing single-image super-resolution by replacing the standard feed-forward network in Swin2SR with a spatial-frequency gated module, achieving 45.19 dB PSNR and 0.9852 SSIM on SpaceNet.

Remote Sensing (RS) single-image super-resolution aims to reconstruct high-resolution imagery from low-resolution observations while preserving fine spatial structures. Recent Swin Transformer-based models, including Swin2SR, provide strong spatial context modeling through shifted-window self-attention, but their feed-forward networks remain generic channel-mixing modules and do not separate low-frequency structural content from high-frequency residual detail. To address this limitation, we propose SFG-SwinSR, a Spatial-Frequency Gated Swin Transformer for single-image super-resolution in remote sensing. SFG-SwinSR modifies the original Swin2SR attention block by replacing each transformer block's standard feed-forward network with a lightweight Spatial-Frequency Gated Feed-Forward Network (SFG-FFN). The module estimates low-frequency content via a depthwise-blur branch, extracts high-frequency residuals by subtraction, refines them with a lightweight spatial branch, and adaptively injects detail through a bottleneck gate. Experiments on SpaceNet and SEN2VENμS show that SFG-SwinSR improves reconstruction quality under the evaluated settings. On SpaceNet, it achieves 45.19 dB PSNR and 0.9852 SSIM, indicating effective enhancement of high-frequency details. This demonstrates that spatial-frequency transformation within the transformer feed-forward network improves detail reconstruction in RS super-resolution.
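The four steps the abstract lists for SFG-FFN (blur, subtract, refine, gate) can be illustrated with a small standard-library Python sketch. This is not the authors' implementation. The fixed 3x3 box blur, the replicate padding, the squeeze-style gate computed from per-channel means, and the residual form `x + gate * refined` are all assumptions made for illustration, and the names `depthwise_conv3x3`, `sfg_inject`, `refine_kernels`, `w_down`, and `w_up` are hypothetical. The channel-mixing layers of the feed-forward network are omitted.

```python
import math
import random


def depthwise_conv3x3(x, kernels):
    """Depthwise 3x3 convolution over x[C][H][W], one kernel per channel.

    Borders use replicate padding (an assumption of this sketch).
    """
    out = []
    for chan, k in zip(x, kernels):
        h, w = len(chan), len(chan[0])
        res = [[0.0] * w for _ in range(h)]
        for i in range(h):
            for j in range(w):
                acc = 0.0
                for di in (-1, 0, 1):
                    for dj in (-1, 0, 1):
                        ii = min(max(i + di, 0), h - 1)
                        jj = min(max(j + dj, 0), w - 1)
                        acc += k[di + 1][dj + 1] * chan[ii][jj]
                res[i][j] = acc
        out.append(res)
    return out


def sfg_inject(x, refine_kernels, w_down, w_up):
    """Hypothetical spatial-frequency gated detail injection on x[C][H][W].

    w_down has shape [C // r][C] and w_up has shape [C][C // r] (the bottleneck).
    """
    c = len(x)
    box = [[1.0 / 9.0] * 3 for _ in range(3)]
    # 1) Low-frequency estimate: fixed depthwise box blur (stand-in for a learned blur).
    low = depthwise_conv3x3(x, [box] * c)
    # 2) High-frequency residual by subtraction.
    high = [[[a - b for a, b in zip(rx, rl)] for rx, rl in zip(cx, cl)]
            for cx, cl in zip(x, low)]
    # 3) Lightweight spatial refinement of the residual (depthwise 3x3).
    refined = depthwise_conv3x3(high, refine_kernels)
    # 4) Bottleneck gate: per-channel mean -> reduce -> ReLU -> expand -> sigmoid.
    pooled = [sum(map(sum, chan)) / (len(chan) * len(chan[0])) for chan in x]
    hidden = [max(0.0, sum(wk * p for wk, p in zip(row, pooled))) for row in w_down]
    gate = [1.0 / (1.0 + math.exp(-sum(wk * hv for wk, hv in zip(row, hidden))))
            for row in w_up]
    # 5) Inject the gated detail back onto the input features.
    out = [[[a + g * r for a, r in zip(rx, rr)] for rx, rr in zip(cx, cr)]
           for cx, cr, g in zip(x, refined, gate)]
    return out, low, high, gate


# Toy usage with random features and random (untrained) weights.
random.seed(0)
C, H, W, R = 4, 6, 6, 2
x = [[[random.random() for _ in range(W)] for _ in range(H)] for _ in range(C)]
refine_kernels = [[[random.uniform(-0.1, 0.1) for _ in range(3)] for _ in range(3)]
                  for _ in range(C)]
w_down = [[random.uniform(-1, 1) for _ in range(C)] for _ in range(C // R)]
w_up = [[random.uniform(-1, 1) for _ in range(C // R)] for _ in range(C)]
out, low, high, gate = sfg_inject(x, refine_kernels, w_down, w_up)
```

In this sketch a perfectly flat feature map has no high-frequency residual, so the output equals the input whatever the gate value. The gate only matters where the blur and the original features disagree, which is the intuition behind injecting detail adaptively.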

