IV CV · May 22, 2024

I2I-Mamba: Multi-modal medical image synthesis via selective state space modeling

Omer F. Atli, Bilal Kabas, Fuat Arslan, Arda C. Demirtas, Mahmut Yurt, Onat Dalmaz, Tolga Çukur

arXiv:2405.14022v6 · 24.061 citations · h-index: 33 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses the challenge of capturing contextual features in medical images for improved synthesis. It offers a novel architecture that could benefit medical imaging applications, though the advance appears incremental because it builds on existing SSM frameworks.

The paper tackles the problem of multi-modal medical image synthesis by introducing I2I-Mamba, a method based on state space modeling with dual-domain Mamba blocks and a spiral-scan trajectory, which outperforms state-of-the-art CNNs, transformers, and SSMs on multi-contrast MRI and MRI-CT protocols.

Multi-modal medical image synthesis involves nonlinear transformation of tissue signals between source and target modalities, where tissues exhibit contextual interactions across diverse spatial distances. As such, the utility of a network architecture in synthesis depends on its ability to express the broad set of contextual features in medical images. Convolutional neural networks (CNNs) offer high local precision at the expense of poor sensitivity to long-range context. While transformers promise to alleviate this issue, they suffer from an unfavorable trade-off between sensitivity to long- versus short-range context due to the intrinsic complexity of attention filters. To effectively capture contextual features while avoiding the complexity-driven trade-offs, here we introduce a novel multi-modal synthesis method, I2I-Mamba, based on the state space modeling (SSM) framework. Focusing on high-level representations across a hybrid residual architecture, I2I-Mamba leverages novel dual-domain Mamba (ddMamba) blocks for complementary contextual modeling in image and Fourier domains, while maintaining spatial precision with convolutional layers. Diverting from conventional raster-scan trajectories, ddMamba leverages novel SSM operators based on a spiral-scan trajectory to learn context with enhanced angular isotropy and radial coverage, and a channel-mixing layer to aggregate context across the channel dimension. Comprehensive demonstrations on multi-contrast MRI and MRI-CT protocols indicate that I2I-Mamba outperforms state-of-the-art CNNs, transformers and SSMs.
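
To make the contrast with raster scanning concrete, the sketch below flattens a 2D feature map into a 1D sequence along a square spiral, the kind of ordering a sequence model such as an SSM could consume. It is only an illustration: the abstract does not specify the exact trajectory used in ddMamba, so the square clockwise spiral, the center-out option, and the names `spiral_scan_order` and `flatten_along_spiral` are all assumptions made here.

```python
def spiral_scan_order(height, width):
    """Return (row, col) pairs covering a height x width grid along a
    clockwise square spiral that starts at the top-left corner and winds
    inward. Assumed trajectory for illustration, not the paper's."""
    top, bottom, left, right = 0, height - 1, 0, width - 1
    order = []
    while top <= bottom and left <= right:
        # Top edge, left to right.
        for c in range(left, right + 1):
            order.append((top, c))
        # Right edge, top to bottom (corner already visited).
        for r in range(top + 1, bottom + 1):
            order.append((r, right))
        # Bottom edge, right to left, only if it is a distinct row.
        if top < bottom:
            for c in range(right - 1, left - 1, -1):
                order.append((bottom, c))
        # Left edge, bottom to top, only if it is a distinct column.
        if left < right:
            for r in range(bottom - 1, top, -1):
                order.append((r, left))
        # Shrink the bounding box by one ring.
        top, bottom, left, right = top + 1, bottom - 1, left + 1, right - 1
    return order


def flatten_along_spiral(image, center_out=True):
    """Flatten a 2D list into a 1D list along the spiral.
    With center_out=True the inward spiral is reversed, so the sequence
    starts near the center and ends at the border."""
    order = spiral_scan_order(len(image), len(image[0]))
    if center_out:
        order = order[::-1]
    return [image[r][c] for r, c in order]


if __name__ == "__main__":
    grid = [[1, 2, 3],
            [4, 5, 6],
            [7, 8, 9]]
    raster = [v for row in grid for v in row]
    print("raster:", raster)
    print("spiral:", flatten_along_spiral(grid, center_out=True))
```

In a raster scan, vertically adjacent pixels end up a full row apart in the sequence, whereas a spiral keeps each ring around the center contiguous. That is one plausible reading of the "angular isotropy and radial coverage" the abstract attributes to the spiral trajectory.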
