CV IV | Apr 11, 2024

FusionMamba: Efficient Remote Sensing Image Fusion with State Space Model

Siran Peng, Xiangyu Zhu, Haoyu Deng, Liang-Jian Deng, Zhen Lei

arXiv:2404.07932v319.845 citations | h-index: 31 | Has Code | IEEE Trans Geosci Remote Sens

Originality Highly original

AI Analysis

This work addresses the need for efficient and accurate image fusion in remote sensing by replacing CNN- and Transformer-based feature integration with a state space model approach.

The paper tackles the problem of remote sensing image fusion by proposing FusionMamba, a method that uses state space models to efficiently integrate spatial and spectral features, achieving state-of-the-art performance across six datasets.

Remote sensing image fusion aims to generate a high-resolution multi/hyper-spectral image by combining a high-resolution image with limited spectral data and a low-resolution image rich in spectral information. Current deep learning (DL) methods typically employ convolutional neural networks (CNNs) or Transformers for feature extraction and information integration. While CNNs are efficient, their limited receptive fields restrict their ability to capture global context. Transformers excel at learning global information but are computationally expensive. Recent advancements in the state space model (SSM), particularly Mamba, present a promising alternative by enabling global perception with low complexity. However, the potential of SSM for information integration remains largely unexplored. Therefore, we propose FusionMamba, an innovative method for efficient remote sensing image fusion. Our contributions are twofold. First, to effectively merge spatial and spectral features, we expand the single-input Mamba block to accommodate dual inputs, creating the FusionMamba block, which serves as a plug-and-play solution for information integration. Second, we incorporate Mamba and FusionMamba blocks into an interpretable network architecture tailored for remote sensing image fusion. Our design uses two U-shaped network branches, each primarily composed of four-directional Mamba blocks, to extract spatial and spectral features separately and hierarchically. The resulting feature maps are thoroughly merged in an auxiliary network branch constructed with FusionMamba blocks. Furthermore, we improve the representation of spectral information through an enhanced channel attention module. Quantitative and qualitative evaluation results across six datasets demonstrate that our method achieves state-of-the-art (SOTA) performance. The code is available at https://github.com/PSRben/FusionMamba.
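To make the idea of four-directional scanning with low complexity more concrete, the following is a minimal sketch in plain Python. It is not the authors' implementation: the function names `linear_scan` and `four_direction_scan`, the choice of the four scan orders (row-major, column-major, and their reverses), and the fixed scalar `decay` and `gain` parameters are all assumptions made for illustration. Mamba itself uses learned, input-dependent state space parameters and a hardware-aware parallel scan, which this sketch omits.

```python
def linear_scan(seq, decay=0.9, gain=0.1):
    """Run the recurrence h_t = decay * h_{t-1} + gain * x_t over a 1-D sequence.

    This is a toy stand-in for a state space model: cost is linear in the
    sequence length, and every output depends on all earlier inputs.
    """
    state, out = 0.0, []
    for x in seq:
        state = decay * state + gain * x
        out.append(state)
    return out


def four_direction_scan(grid, decay=0.9, gain=0.1):
    """Scan a 2-D grid along four orders and sum the results per pixel.

    The four orders used here (row-major, reversed row-major, column-major,
    reversed column-major) are an assumption for illustration only.
    """
    h, w = len(grid), len(grid[0])
    row_major = [(i, j) for i in range(h) for j in range(w)]
    col_major = [(i, j) for j in range(w) for i in range(h)]
    orders = [row_major, row_major[::-1], col_major, col_major[::-1]]

    out = [[0.0] * w for _ in range(h)]
    for order in orders:
        # Flatten the grid along this order, scan it, and scatter the
        # scanned values back to their original pixel positions.
        scanned = linear_scan([grid[i][j] for i, j in order], decay, gain)
        for (i, j), y in zip(order, scanned):
            out[i][j] += y
    return out


# Usage: a single bright pixel influences every position in the output.
grid = [[0.0] * 5 for _ in range(4)]
grid[1][2] = 1.0
result = four_direction_scan(grid)
```

Because each forward scan reaches the pixels after the impulse and each reversed scan reaches the pixels before it, every output position receives a nonzero contribution, while the total work stays proportional to the number of pixels. This is the sense in which a scan-based model can offer global context without the quadratic cost of full attention.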
