CV · Jun 23, 2025

Frequency-Domain Fusion Transformer for Image Inpainting

arXiv:2506.18437v1 · 1 citation · h-index: 1
Originality: Incremental advance
AI Analysis

This work addresses the restoration of missing image regions with complex textures, and represents an incremental improvement over existing Transformer-based approaches.

The paper targets two limitations of Transformer-based inpainting: poor preservation of high-frequency detail and high computational cost. Its experiments report improved inpainting quality with better retention of fine detail.

Image inpainting plays a vital role in restoring missing image regions and supporting high-level vision tasks, but traditional methods struggle with complex textures and large occlusions. Although Transformer-based approaches have demonstrated strong global modeling capabilities, they often fail to preserve high-frequency details due to the low-pass nature of self-attention and suffer from high computational costs. To address these challenges, this paper proposes a Transformer-based image inpainting method incorporating frequency-domain fusion. Specifically, an attention mechanism combining wavelet transform and Gabor filtering is introduced to enhance multi-scale structural modeling and detail preservation. Additionally, a learnable frequency-domain filter based on the fast Fourier transform is designed to replace the feedforward network, enabling adaptive noise suppression and detail retention. The model adopts a four-level encoder-decoder structure and is guided by a novel loss strategy to balance global semantics and fine details. Experimental results demonstrate that the proposed method effectively improves the quality of image inpainting by preserving more high-frequency information.
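To make the idea of an FFT-based filter standing in for the feedforward network more concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function name `frequency_filter`, the `(H, W, C)` layout, and the per-frequency complex weight are all assumptions chosen for illustration, and the weights are fixed by hand here, whereas in a trained model they would be learned.

```python
import numpy as np


def frequency_filter(x, weight):
    """Filter a real (H, W, C) feature map in the frequency domain.

    `weight` is a complex array of shape (H, W // 2 + 1, C), one gain per
    spatial frequency and channel. Hypothetical helper, for illustration only.
    """
    h, w, _ = x.shape
    spectrum = np.fft.rfft2(x, axes=(0, 1))   # 2-D FFT over the spatial axes
    spectrum = spectrum * weight              # element-wise spectral gating
    return np.fft.irfft2(spectrum, s=(h, w), axes=(0, 1))


rng = np.random.default_rng(0)
x = rng.standard_normal((8, 8, 4))

# An all-ones weight passes every frequency unchanged.
identity = np.ones((8, 5, 4), dtype=np.complex128)
y_identity = frequency_filter(x, identity)

# Keeping only the zero-frequency (DC) bin leaves the per-channel spatial mean,
# i.e. an extreme low-pass filter that discards all detail.
dc_only = np.zeros((8, 5, 4), dtype=np.complex128)
dc_only[0, 0, :] = 1.0
y_dc = frequency_filter(x, dc_only)
```

The two hand-set weights mark the extremes: the identity filter keeps all detail, while the DC-only filter removes everything but the mean. A learnable filter of this form can settle anywhere in between, frequency by frequency, which is one way to read the abstract's "adaptive noise suppression and detail retention".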

