CV | Nov 10, 2025

Spatial-Frequency Enhanced Mamba for Multi-Modal Image Fusion

Hui Sun, Long Lv, Pingping Zhang, Tongdan Tang, Feng Tian, Weibing Sun, Huchuan Lu

arXiv:2511.06593v1 | 7 citations | h-index: 13 | Has Code | IEEE Transactions on Image Processing

Originality: Incremental advance

AI Analysis

This work addresses Multi-Modal Image Fusion (MMIF) for applications that need image information integrated from several modalities. It appears incremental, since it builds on existing Mamba models with targeted enhancements.

The paper tackles Multi-Modal Image Fusion (MMIF) by proposing SFMFusion, a framework that enhances Mamba with spatial and frequency perception and integrates image reconstruction as an auxiliary task. It reports better results than most state-of-the-art methods on six datasets.

Multi-Modal Image Fusion (MMIF) aims to integrate complementary image information from different modalities to produce informative images. Previous deep learning-based MMIF methods generally adopt Convolutional Neural Networks (CNNs) or Transformers for feature extraction. However, these methods deliver unsatisfactory performance due to the limited receptive field of CNNs and the high computational cost of Transformers. Recently, Mamba has demonstrated strong potential for modeling long-range dependencies with linear complexity, providing a promising solution to MMIF. Unfortunately, Mamba lacks full spatial and frequency perception, which is very important for MMIF. Moreover, employing Image Reconstruction (IR) as an auxiliary task has been proven beneficial for MMIF. However, a primary challenge is how to leverage IR efficiently and effectively. To address these issues, we propose a novel framework named Spatial-Frequency Enhanced Mamba Fusion (SFMFusion) for MMIF. More specifically, we first propose a three-branch structure to couple MMIF and IR, which can retain complete contents from source images. Then, we propose the Spatial-Frequency Enhanced Mamba Block (SFMB), which can enhance Mamba in both spatial and frequency domains for comprehensive feature extraction. Finally, we propose the Dynamic Fusion Mamba Block (DFMB), which can be deployed across different branches for dynamic feature fusion. Extensive experiments show that our method achieves better results than most state-of-the-art methods on six MMIF datasets. The source code is available at https://github.com/SunHui1216/SFMFusion.
