IV | CV | Nov 26, 2024

Reliability of deep learning models for anatomical landmark detection: The role of inter-rater variability

arXiv:2411.17850v1 | 2 citations | h-index: 31 | Medical Imaging
Originality: Incremental advance
AI Analysis

This paper addresses the reliability of automated landmark detection for clinical deployment in medical imaging, though its contribution is incremental and centers on data annotation strategies.

The study tackled the problem of inter-rater variability in training deep learning models for anatomical landmark detection, finding that different annotation-fusion strategies can improve performance and reliability, with a novel Weighted Coordinate Variance metric introduced to quantify uncertainty.

Automated detection of anatomical landmarks plays a crucial role in many diagnostic and surgical applications. Progress in deep learning (DL) methods has significantly improved performance on tasks related to anatomical landmark detection. While current research focuses on accurately localizing these landmarks in medical scans, the importance of inter-rater annotation variability in building DL models is often overlooked. Understanding how inter-rater variability affects the performance and reliability of the resulting DL algorithms, which are crucial for clinical deployment, can inform better training data construction and improve DL models' outcomes. In this paper, we conducted a thorough study of different annotation-fusion strategies to preserve inter-rater variability in DL models for anatomical landmark detection, aiming to boost the performance and reliability of the resulting algorithms. Additionally, we explored the characteristics and reliability of four metrics, including a novel Weighted Coordinate Variance metric, for quantifying landmark detection uncertainty/inter-rater variability. Our research highlights the crucial connection between inter-rater variability, DL model performance, and uncertainty, revealing how different approaches to multi-rater landmark annotation fusion can influence these factors.
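To make the two ideas in the abstract concrete, here is a minimal sketch of one possible annotation-fusion strategy (averaging the raters' coordinates) and of a plain, unweighted coordinate variance as a rough measure of inter-rater spread. Everything in it is an assumption for illustration: the data, the function names `fuse_by_mean` and `coordinate_variance`, and the choice of mean fusion are hypothetical. In particular, this is not the paper's Weighted Coordinate Variance metric, whose definition is not given in this summary.

```python
from statistics import fmean, pvariance

# Hypothetical data: 3 raters each mark the same 2 landmarks as (x, y) pixels.
annotations = [
    [(10.0, 20.0), (50.0, 60.0)],  # rater A
    [(12.0, 22.0), (50.0, 63.0)],  # rater B
    [(14.0, 18.0), (50.0, 57.0)],  # rater C
]


def fuse_by_mean(raters):
    """Average each landmark's coordinates across raters.

    This is only one simple fusion choice, assumed here for illustration.
    """
    fused = []
    for points in zip(*raters):  # every rater's version of one landmark
        fused.append(tuple(fmean(axis) for axis in zip(*points)))
    return fused


def coordinate_variance(raters):
    """Per landmark, sum the population variance of each axis across raters.

    An unweighted spread measure, NOT the paper's weighted metric.
    """
    return [
        sum(pvariance(axis) for axis in zip(*points))
        for points in zip(*raters)
    ]


fused = fuse_by_mean(annotations)
spread = coordinate_variance(annotations)

for i, (point, var) in enumerate(zip(fused, spread)):
    print(f"landmark {i}: fused={point}, variance={var:.3f}")
```

In this toy data the raters agree exactly on the x-coordinate of the second landmark, so its spread comes entirely from the y-axis. A fused label discards that disagreement, which is why a separate variability measure is reported alongside it.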

