CVApr 14, 2025

Global and Local Mamba Network for Multi-Modality Medical Image Super-Resolution

arXiv:2504.10105v18 citationsh-index: 12Pattern Recognition
Originality Incremental advance
AI Analysis

This work addresses super-resolution for medical imaging, offering a more efficient method compared to existing approaches, though it appears incremental as it builds on State Space Models like Mamba.

The paper tackles the problem of multi-modality medical image super-resolution by proposing a global and local Mamba network (GLMamba) to efficiently model long-range dependencies and local details, achieving improved performance with linear computational complexity.

Convolutional neural networks and Transformer have made significant progresses in multi-modality medical image super-resolution. However, these methods either have a fixed receptive field for local learning or significant computational burdens for global learning, limiting the super-resolution performance. To solve this problem, State Space Models, notably Mamba, is introduced to efficiently model long-range dependencies in images with linear computational complexity. Relying on the Mamba and the fact that low-resolution images rely on global information to compensate for missing details, while high-resolution reference images need to provide more local details for accurate super-resolution, we propose a global and local Mamba network (GLMamba) for multi-modality medical image super-resolution. To be specific, our GLMamba is a two-branch network equipped with a global Mamba branch and a local Mamba branch. The global Mamba branch captures long-range relationships in low-resolution inputs, and the local Mamba branch focuses more on short-range details in high-resolution reference images. We also use the deform block to adaptively extract features of both branches to enhance the representation ability. A modulator is designed to further enhance deformable features in both global and local Mamba blocks. To fully integrate the reference image for low-resolution image super-resolution, we further develop a multi-modality feature fusion block to adaptively fuse features by considering similarities, differences, and complementary aspects between modalities. In addition, a contrastive edge loss (CELoss) is developed for sufficient enhancement of edge textures and contrast in medical images.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes