The Missing Piece: A Case for Pre-Training in 3D Medical Object Detection
This work addresses 3D medical object detection, a crucial component of accurate computer-aided diagnosis. Its contribution is incremental in that it applies existing pre-training methods to an underexplored area rather than proposing a new one.
The authors tackle 3D medical object detection by systematically studying how existing pre-training methods integrate into state-of-the-art detection architectures. They find that pre-training consistently improves performance across tasks and datasets, and that reconstruction-based self-supervised pre-training outperforms supervised pre-training.
Large-scale pre-training holds promise for advancing 3D medical object detection, a crucial component of accurate computer-aided diagnosis. Yet it remains underexplored compared to segmentation, where pre-training has already demonstrated significant benefits. Existing pre-training approaches for 3D object detection rely on 2D medical data or natural-image pre-training and therefore fail to fully leverage 3D volumetric information. In this work, we present the first systematic study of how existing pre-training methods can be integrated into state-of-the-art detection architectures, covering both CNNs and Transformers. Our results show that pre-training consistently improves detection performance across various tasks and datasets. Notably, reconstruction-based self-supervised pre-training outperforms supervised pre-training, while contrastive pre-training provides no clear benefit for 3D medical object detection. Our code is publicly available at: https://github.com/MIC-DKFZ/nnDetection-finetuning.