ReCo-KD: Region- and Context-Aware Knowledge Distillation for Efficient 3D Medical Image Segmentation
This work addresses deployment and speed constraints for 3D medical image segmentation in clinical settings, offering a practical option for clinics with limited computing resources. Its contribution is incremental, since it builds on existing knowledge distillation methods.
The paper tackles the problem of deploying accurate 3D medical image segmentation models in clinics with limited computing resources. It proposes ReCo-KD, a knowledge distillation framework that transfers fine-grained anatomical detail and long-range contextual information from a high-capacity teacher to a compact student network. The resulting lightweight model attains accuracy close to the teacher while markedly reducing parameters and inference latency.
Accurate 3D medical image segmentation is vital for diagnosis and treatment planning, but state-of-the-art models are often too large for clinics with limited computing resources. Lightweight architectures typically suffer significant performance loss. To address these deployment and speed constraints, we propose Region- and Context-aware Knowledge Distillation (ReCo-KD), a training-only framework that transfers both fine-grained anatomical detail and long-range contextual information from a high-capacity teacher to a compact student network. The framework integrates Multi-Scale Structure-Aware Region Distillation (MS-SARD), which applies class-aware masks and scale-normalized weighting to emphasize small but clinically important regions, and Multi-Scale Context Alignment (MS-CA), which aligns teacher-student affinity patterns across feature levels. Implemented on nnU-Net in a backbone-agnostic manner, ReCo-KD requires no custom student design and is easily adapted to other architectures. Experiments on multiple public 3D medical segmentation datasets and a challenging aggregated dataset show that the distilled lightweight model attains accuracy close to the teacher while markedly reducing parameters and inference latency, underscoring its practicality for clinical deployment.
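The two components named in the abstract can be illustrated with a minimal NumPy sketch at a single feature scale. The abstract does not give the loss formulas, so everything here is an assumption made for illustration rather than the authors' implementation: the function names, the use of squared error, the per-class size normalization standing in for "scale-normalized weighting", and cosine similarity as the affinity measure. The region term also assumes the student and teacher features have the same channel count (in practice a projection layer might be needed), whereas the affinity term works with differing channel counts because it compares voxel-by-voxel similarity matrices.

```python
import numpy as np

def region_distill_loss(f_student, f_teacher, labels, num_classes):
    """Hypothetical region-aware feature loss at one scale.

    f_student, f_teacher: arrays of shape (C, D, H, W).
    labels: integer class map of shape (D, H, W) at the same resolution.
    Each class present in the patch contributes its mean squared error,
    so a small structure weighs as much as a large one.
    """
    sq_err = ((f_student - f_teacher) ** 2).mean(axis=0)  # (D, H, W)
    total, present = 0.0, 0
    for k in range(num_classes):
        mask = labels == k            # class-aware mask
        n = mask.sum()
        if n == 0:
            continue                  # class absent from this patch
        total += sq_err[mask].sum() / n   # normalize by region size
        present += 1
    return float(total / max(present, 1))

def affinity(f):
    """Cosine similarity between all voxel pairs: (C, D, H, W) -> (N, N)."""
    flat = f.reshape(f.shape[0], -1).T                       # (N, C)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    return flat @ flat.T

def context_align_loss(f_student, f_teacher):
    """Hypothetical context term: match teacher and student affinity maps."""
    diff = affinity(f_student) - affinity(f_teacher)
    return float(np.mean(diff ** 2))

def multi_scale_loss(student_feats, teacher_feats, label_maps, num_classes):
    """Sum both terms over feature levels (equal weights assumed)."""
    return sum(
        region_distill_loss(s, t, y, num_classes) + context_align_loss(s, t)
        for s, t, y in zip(student_feats, teacher_feats, label_maps)
    )
```

With this normalization, an error confined to a single-voxel class contributes as much as an error spread over the whole background, which is the effect the abstract attributes to emphasizing small but clinically important regions. The full pairwise affinity matrix grows quadratically with the number of voxels, so a sketch like this is only practical on coarse feature levels or subsampled voxels.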