Volumetric landmark detection with a multi-scale shift equivariant neural network
This work improves landmark detection accuracy for medical imaging applications. The contribution is incremental: it builds on existing deep learning methods and adds specific architectural enhancements.
The paper tackles anatomical landmark detection in volumetric CT scans by proposing a multi-scale, shift-equivariant neural network that works within GPU memory constraints. It reports a mean Euclidean distance error of 2.81 mm for carotid artery bifurcation detection on 263 CT volumes, which the authors describe as better than the state of the art.
Deep neural networks yield promising results in a wide range of computer vision applications, including landmark detection. A major challenge for accurate anatomical landmark detection in volumetric images such as clinical CT scans is that large-scale data often constrain the capacity of the employed neural network architecture due to GPU memory limitations, which in turn can limit the precision of the output. We propose a multi-scale, end-to-end deep learning method that achieves fast and memory-efficient landmark detection in 3D images. Our architecture consists of blocks of shift-equivariant networks, each of which performs landmark detection at a different spatial scale. These blocks are connected from coarse to fine scale with differentiable resampling layers, so that all levels can be trained together. We also present a noise injection strategy that increases the robustness of the model and allows us to quantify uncertainty at test time. We evaluate our method for carotid artery bifurcation detection on 263 CT volumes and achieve better than state-of-the-art accuracy, with a mean Euclidean distance error of 2.81 mm.
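The coarse-to-fine idea can be illustrated with a toy sketch in plain Python. This is not the authors' network. It assumes a center-of-mass readout (the abstract does not say how a block turns its response into coordinates), uses a synthetic Gaussian bump in place of a learned response, and omits training and noise injection. All names (`blob`, `center_of_mass`, `coarse_to_fine`) and all parameter values are hypothetical.

```python
import math
from itertools import product


def blob(point, center, sigma=1.5):
    """Synthetic response: a Gaussian bump centred on a hypothetical landmark."""
    d2 = sum((p - c) ** 2 for p, c in zip(point, center))
    return math.exp(-d2 / (2.0 * sigma ** 2))


def center_of_mass(origin, size, stride, response):
    """Response-weighted mean coordinate over a size^3 grid of sampled voxels."""
    total = 0.0
    acc = [0.0, 0.0, 0.0]
    for idx in product(range(size), repeat=3):
        point = tuple(o + stride * i for o, i in zip(origin, idx))
        w = response(point)
        total += w
        for k in range(3):
            acc[k] += w * point[k]
    return tuple(a / total for a in acc)


def coarse_to_fine(response, extent=64, coarse_stride=4, patch=9):
    """Two-level localisation: strided pass over the volume, then a small crop."""
    # Coarse level: sample every 4th voxel, so only 16^3 values are visited.
    coarse = center_of_mass((0, 0, 0), extent // coarse_stride,
                            coarse_stride, response)
    # Fine level: full-resolution 9^3 crop centred on the rounded coarse guess.
    origin = tuple(int(round(c)) - patch // 2 for c in coarse)
    fine = center_of_mass(origin, patch, 1, response)
    return coarse, fine


landmark = (30.0, 41.0, 22.0)
coarse, fine = coarse_to_fine(lambda p: blob(p, landmark))

# Moving the landmark by whole voxels moves the fine estimate by the same amount.
shift = (2, -3, 1)
shifted = tuple(l + s for l, s in zip(landmark, shift))
_, fine_shifted = coarse_to_fine(lambda p: blob(p, shifted))

print("coarse estimate:", coarse)
print("fine estimate:  ", fine)
print("fine estimate after shift:", fine_shifted)
```

In this toy setting the coarse pass is off by a fraction of a voxel along the second axis, because the stride skips the peak, and the full-resolution crop removes that error while visiting only 9^3 voxels. Memory per level therefore stays small even when the volume is large, which is the constraint the abstract highlights.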