DSVM-UNet: Enhancing VM-UNet with Dual Self-distillation for Medical Image Segmentation
This work improves medical image segmentation for healthcare applications, but it is incremental: it builds on the existing VM-UNet and adds a distillation-based enhancement.
The paper tackles medical image segmentation by enhancing VM-UNet with dual self-distillation, achieving state-of-the-art performance on ISIC2017, ISIC2018, and Synapse benchmarks while maintaining computational efficiency.
Vision Mamba models have been studied extensively across many fields; they address the limitations of earlier models by handling long-range dependencies with linear-time overhead. Several studies have further designed UNet-based Vision Mamba models (VM-UNet) for medical image segmentation. These approaches primarily focus on optimizing the architecture, building more complex structures to enhance the model's ability to perceive semantic features. In this paper, we propose a simple yet effective alternative, Dual Self-distillation for VM-UNet (DSVM-UNet), which improves the model without any complex architectural design. To this end, we develop dual self-distillation methods that align features at both the global and local levels. Extensive experiments on the ISIC2017, ISIC2018, and Synapse benchmarks demonstrate that our approach achieves state-of-the-art performance while maintaining computational efficiency. Code is available at https://github.com/RoryShao/DSVM-UNet.git.
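To make the idea of aligning features at two levels concrete, the following is a minimal sketch in plain Python, not the authors' implementation. The abstract does not specify the loss forms, so everything here is an assumption: the global term is taken to be a temperature-softened KL divergence between a deeper (teacher) output and a shallower (student) output, the local term is taken to be a mean squared error between feature vectors, and the function names, `temperature`, `alpha`, and `beta` are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def global_distill_loss(student_logits, teacher_logits, temperature=2.0):
    """Assumed global-level term: KL between softened teacher and student outputs."""
    p_teacher = softmax(teacher_logits, temperature)
    p_student = softmax(student_logits, temperature)
    # The T^2 factor is the usual scaling in temperature-based distillation.
    return temperature ** 2 * kl_divergence(p_teacher, p_student)


def local_distill_loss(student_feat, teacher_feat):
    """Assumed local-level term: mean squared error between feature vectors."""
    squared = [(s - t) ** 2 for s, t in zip(student_feat, teacher_feat)]
    return sum(squared) / len(squared)


def dual_self_distill_loss(seg_loss, student_logits, teacher_logits,
                           student_feat, teacher_feat, alpha=0.5, beta=0.5):
    """Segmentation loss plus weighted global and local terms (weights are hypothetical)."""
    return (seg_loss
            + alpha * global_distill_loss(student_logits, teacher_logits)
            + beta * local_distill_loss(student_feat, teacher_feat))
```

In a self-distillation setting, the teacher and student signals both come from the same network (for example, deeper and shallower stages), so no separate teacher model is trained. The two terms add only loss computations during training, which is consistent with the claim that no architectural changes are needed.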