CVSep 19, 2025

UniMRSeg: Unified Modality-Relax Segmentation via Hierarchical Self-Supervised Compensation

Xiaoqi Zhao, Youwei Pang, Chenyang Yu, Lihe Zhang, Huchuan Lu, Shijian Lu, Georges El Fakhri, Xiaofeng Liu

arXiv:2509.16170v111.84 citationsh-index: 32Has Code

Originality Incremental advance

AI Analysis

This addresses deployment challenges in medical and computer vision applications by reducing costs and improving robustness, though it is incremental in building on existing modality-gap solutions.

The paper tackles the problem of multi-modal image segmentation performance degradation due to incomplete or corrupted modalities by proposing UniMRSeg, a unified network with hierarchical self-supervised compensation, which significantly outperforms state-of-the-art methods across diverse missing modality scenarios in tasks like brain tumor and semantic segmentation.

Multi-modal image segmentation faces real-world deployment challenges from incomplete/corrupted modalities degrading performance. While existing methods address training-inference modality gaps via specialized per-combination models, they introduce high deployment costs by requiring exhaustive model subsets and model-modality matching. In this work, we propose a unified modality-relax segmentation network (UniMRSeg) through hierarchical self-supervised compensation (HSSC). Our approach hierarchically bridges representation gaps between complete and incomplete modalities across input, feature and output levels. % First, we adopt modality reconstruction with the hybrid shuffled-masking augmentation, encouraging the model to learn the intrinsic modality characteristics and generate meaningful representations for missing modalities through cross-modal fusion. % Next, modality-invariant contrastive learning implicitly compensates the feature space distance among incomplete-complete modality pairs. Furthermore, the proposed lightweight reverse attention adapter explicitly compensates for the weak perceptual semantics in the frozen encoder. Last, UniMRSeg is fine-tuned under the hybrid consistency constraint to ensure stable prediction under all modality combinations without large performance fluctuations. Without bells and whistles, UniMRSeg significantly outperforms the state-of-the-art methods under diverse missing modality scenarios on MRI-based brain tumor segmentation, RGB-D semantic segmentation, RGB-D/T salient object segmentation. The code will be released at https://github.com/Xiaoqi-Zhao-DLUT/UniMRSeg.

View on arXiv PDF Code

Similar