CV · IV · Mar 24

Harnessing Lightweight Transformer with Contextual Synergic Enhancement for Efficient 3D Medical Image Segmentation

Xinyu Liu, Zhen Chen, Wuyang Li, Chenxin Li, Yixuan Yuan

arXiv:2603.23390 · 74.41 citations · h-index: 18 · Has Code

Predicted impact: top 37% in CV · last 90 days · Originality: Incremental advance

AI Analysis

The work targets efficiency bottlenecks in medical imaging applications, but its contribution is incremental: it combines a lightweight design with semi-supervised learning.

The paper tackles the high computational and data requirements of transformers in 3D medical image segmentation by proposing Light-UNETR, a lightweight transformer, together with a Contextual Synergic Enhancement learning strategy. With 10% labeled data on the Left Atrial Segmentation dataset, it reports a 1.43% Jaccard improvement over BCP while reducing FLOPs by 90.8% and parameters by 85.8%.
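For readers unfamiliar with the metric behind the reported gain, the following is a minimal sketch of the Jaccard index (intersection over union) for binary 3D masks. The function name `jaccard_index`, the toy volumes, and the empty-mask convention are assumptions made for illustration; this is not the paper's evaluation code, and the listing does not say how the 1.43% figure is averaged across cases.

```python
import numpy as np


def jaccard_index(pred, target):
    """Jaccard index (IoU) between two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Convention assumed here: two empty masks agree perfectly.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection) / float(union)


# Toy example: two 2x2x2 cubes that overlap in half of their voxels.
pred = np.zeros((4, 4, 4), dtype=bool)
target = np.zeros((4, 4, 4), dtype=bool)
pred[0:2, 0:2, 0:2] = True
target[1:3, 0:2, 0:2] = True
score = jaccard_index(pred, target)  # 4 shared voxels / 12 in the union
```

In the toy example each mask covers 8 voxels and they share 4, so the score is 4/12.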

Transformers have shown remarkable performance in 3D medical image segmentation, but their high computational requirements and need for large amounts of labeled data limit their applicability. To address these challenges, we consider two crucial aspects: model efficiency and data efficiency. Specifically, we propose Light-UNETR, a lightweight transformer designed to achieve model efficiency. Light-UNETR features a Lightweight Dimension Reductive Attention (LIDR) module, which reduces spatial and channel dimensions while capturing both global and local features via multi-branch attention. Additionally, we introduce a Compact Gated Linear Unit (CGLU) to selectively control channel interaction with minimal parameters. Furthermore, we introduce a Contextual Synergic Enhancement (CSE) learning strategy, which aims to boost the data efficiency of Transformers. It first leverages the extrinsic contextual information to support the learning of unlabeled data with Attention-Guided Replacement, then applies Spatial Masking Consistency that utilizes intrinsic contextual information to enhance the spatial context reasoning for unlabeled data. Extensive experiments on various benchmarks demonstrate the superiority of our approach in both performance and efficiency. For example, with only 10% labeled data on the Left Atrial Segmentation dataset, our method surpasses BCP by 1.43% Jaccard while drastically reducing the FLOPs by 90.8% and parameters by 85.8%. Code is released at https://github.com/CUHK-AIM-Group/Light-UNETR.
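The abstract does not give the details of Spatial Masking Consistency, so the following is only a generic sketch of the broader idea of masking-based consistency on an unlabeled 3D volume: hide random cubic patches, then penalize disagreement between predictions on the full and the masked input. The names `random_patch_mask`, `masking_consistency_loss`, and `toy_predict`, the patch size, the mask ratio, and the mean-squared-error penalty are all assumptions; the released repository is the reference for the actual method.

```python
import numpy as np


def random_patch_mask(shape, patch=4, mask_ratio=0.5, rng=None):
    """Return a keep-mask (1 = keep, 0 = masked) made of cubic patches."""
    rng = np.random.default_rng() if rng is None else rng
    if any(s % patch for s in shape):
        raise ValueError("each dimension must be divisible by patch")
    grid = tuple(s // patch for s in shape)
    keep = (rng.random(grid) >= mask_ratio).astype(np.float32)
    # Expand every grid cell into a patch-sized block along each axis.
    for axis in range(len(shape)):
        keep = np.repeat(keep, patch, axis=axis)
    return keep


def masking_consistency_loss(predict, volume, mask):
    """Mean squared difference between full-input and masked-input predictions."""
    full = predict(volume)
    masked = predict(volume * mask)
    return float(np.mean((full - masked) ** 2))


def toy_predict(volume):
    # Hypothetical stand-in for a segmentation network: voxel-wise sigmoid.
    return 1.0 / (1.0 + np.exp(-volume))


rng = np.random.default_rng(0)
volume = rng.standard_normal((8, 8, 8)).astype(np.float32)
mask = random_patch_mask(volume.shape, patch=4, mask_ratio=0.5, rng=rng)
loss = masking_consistency_loss(toy_predict, volume, mask)
```

In a typical semi-supervised setup the full-input prediction would come from a teacher model and be treated as a fixed target, but whether Light-UNETR does this is not stated in the abstract.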

