IV CV Jul 19, 2024

De-LightSAM: Modality-Decoupled Lightweight SAM for Generalizable Medical Segmentation

Qing Xu, Jiaxuan Li, Xiangjian He, Chenxin Li, Fiseha B. Tesem, Wenting Duan, Zhen Chen, Rong Qu, Jonathan M. Garibaldi, Chang Wen Chen

arXiv:2407.14153v517.522 citations h-index: 10 Has Code

Originality: Incremental advance

AI Analysis

This work provides a more efficient and generalizable solution for medical image segmentation across diverse modalities, though it appears incremental as it builds directly on SAM.

The paper adapts the Segment Anything Model (SAM) to medical image segmentation by addressing three obstacles: its computational cost, its reliance on manual prompts, and conflicts between modalities during decoding. The resulting model, De-LightSAM, outperforms state-of-the-art methods while using only 2.0% of SAM-H's parameters.

The universality of deep neural networks across different modalities and their generalization capabilities to unseen domains play an essential role in medical image segmentation. The recent Segment Anything Model (SAM) has demonstrated strong adaptability across diverse natural scenarios. However, the huge computational cost, the demand for manual annotations as prompts, and the conflict-prone decoding process of SAM degrade its generalization capabilities in medical scenarios. To address these limitations, we propose a modality-decoupled lightweight SAM for domain-generalized medical image segmentation, named De-LightSAM. Specifically, we first devise a lightweight domain-controllable image encoder (DC-Encoder) that produces discriminative visual features for diverse modalities. Further, we introduce the self-patch prompt generator (SP-Generator) to automatically generate high-quality dense prompt embeddings for guiding segmentation decoding. Finally, we design the query-decoupled modality decoder (QM-Decoder), which leverages a one-to-one strategy to provide an independent decoding channel for every modality, preventing mutual knowledge interference between modalities. Moreover, we design a multi-modal decoupled knowledge distillation (MDKD) strategy that leverages robust common knowledge to complement domain-specific medical feature representations. Extensive experiments indicate that De-LightSAM outperforms state-of-the-art methods in diverse medical image segmentation tasks, displaying superior modality universality and generalization capabilities. Notably, De-LightSAM uses only 2.0% of the parameters of SAM-H. The source code is available at https://github.com/xq141839/De-LightSAM.
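The one-to-one idea behind the QM-Decoder can be illustrated with a toy sketch. This is not the authors' implementation: the class name `QueryDecoupledDecoder`, the dot-product "decoding", and the `update` step are all hypothetical stand-ins, chosen only to show that giving each modality its own query means an update for one modality cannot disturb another.

```python
import random


class QueryDecoupledDecoder:
    """Toy stand-in (hypothetical): one independent query vector per modality."""

    def __init__(self, modalities, dim, seed=0):
        rng = random.Random(seed)
        # One-to-one mapping: each modality owns its own query parameters.
        self.queries = {
            m: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for m in modalities
        }

    def decode(self, features, modality):
        """Label each per-pixel feature vector using only that modality's query."""
        q = self.queries[modality]
        return [sum(f_i * q_i for f_i, q_i in zip(f, q)) > 0 for f in features]

    def update(self, modality, grad, lr=0.1):
        """Gradient-style step that touches only the selected modality's query."""
        q = self.queries[modality]
        self.queries[modality] = [q_i - lr * g_i for q_i, g_i in zip(q, grad)]


decoder = QueryDecoupledDecoder(["ct", "mri", "ultrasound"], dim=4)
mri_before = list(decoder.queries["mri"])
decoder.update("ct", grad=[1.0, 1.0, 1.0, 1.0])
mask = decoder.decode([[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]], "ct")
```

After the `ct` update, the `mri` and `ultrasound` queries are unchanged, which is the isolation property the abstract attributes to per-modality decoding channels. A shared query would instead be pulled in different directions by each modality's updates.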
