SAMba-UNet: SAM2-Mamba UNet for Cardiac MRI in Medical Robotic Perception
This work addresses cardiac MRI segmentation for medical robotic systems, though the contribution appears incremental because it builds on existing models such as SAM2 and Mamba.
The paper tackles automated cardiac MRI segmentation by proposing SAMba-UNet, a dual-encoder architecture combining SAM2, Mamba, and UNet, which achieves a Dice score of 0.9103 and HD95 of 1.0859 mm on the ACDC benchmark, improving boundary localization for structures like the right ventricle.
To address complex pathological feature extraction in automated cardiac MRI segmentation, we propose SAMba-UNet, a novel dual-encoder architecture that combines the vision foundation model SAM2, the linear-complexity state-space model Mamba, and the classical UNet to achieve cross-modal collaborative feature learning. To overcome the domain shift between natural images and medical scans, we introduce a Dynamic Feature Fusion Refiner that employs multi-scale pooling and channel-spatial dual-path calibration to strengthen the representation of small lesions and fine structures. We also design a Heterogeneous Omni-Attention Convergence Module (HOACM) that fuses SAM2's local positional semantics with Mamba's long-range dependency modeling via global contextual attention and branch-selective emphasis, yielding substantial gains in both global consistency and boundary precision. On the ACDC cardiac MRI benchmark, SAMba-UNet attains a Dice of 0.9103 and an HD95 of 1.0859 mm, notably improving boundary localization for challenging structures such as the right ventricle. Its robust, high-fidelity segmentation maps are directly applicable as a perception module within intelligent medical and surgical robotic systems to support preoperative planning, intraoperative navigation, and postoperative complication screening. The code will be open-sourced to facilitate clinical translation and further validation.
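
For readers unfamiliar with the two metrics reported above, the following is a minimal NumPy sketch of how a Dice score and a 95th-percentile Hausdorff distance (HD95) can be computed for one binary structure mask. It is not the authors' evaluation code. The function names `dice_score`, `boundary`, and `hd95` are hypothetical, as is the `spacing` parameter (pixel size in mm). HD95 has several conventions in the literature; this sketch assumes the one that pools surface distances from both directions before taking the percentile, and it assumes 2D masks that are both non-empty.

```python
import numpy as np


def dice_score(pred, target):
    """Dice overlap 2|A∩B| / (|A| + |B|) for two binary masks."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return float(2.0 * np.logical_and(pred, target).sum() / denom)


def boundary(mask):
    """Foreground pixels with at least one 4-connected background neighbour."""
    mask = np.asarray(mask, dtype=bool)
    m = np.pad(mask, 1, constant_values=False)
    # A pixel is interior if it and its four neighbours are all foreground.
    interior = (m[1:-1, 1:-1] & m[:-2, 1:-1] & m[2:, 1:-1]
                & m[1:-1, :-2] & m[1:-1, 2:])
    return mask & ~interior


def hd95(pred, target, spacing=(1.0, 1.0)):
    """95th percentile of pooled boundary-to-boundary distances (2D, brute force).

    Assumes both masks contain at least one foreground pixel.
    """
    a = np.argwhere(boundary(pred)) * np.asarray(spacing)
    b = np.argwhere(boundary(target)) * np.asarray(spacing)
    # Pairwise Euclidean distances between the two boundary point sets.
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    pooled = np.concatenate([d.min(axis=1), d.min(axis=0)])
    return float(np.percentile(pooled, 95))


# Toy example: a 4x4 square and the same square shifted by one column.
target = np.zeros((8, 8), dtype=bool)
target[2:6, 2:6] = True
pred = np.zeros((8, 8), dtype=bool)
pred[2:6, 3:7] = True

print(dice_score(pred, target))  # 0.75
print(hd95(pred, target))        # 1.0
```

For a multi-class label map such as those in ACDC, a score like this would typically be computed per structure (for example with `label_map == c` for each class `c`) and then averaged; the exact averaging protocol behind the reported 0.9103 is not stated here.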