CV AI · Apr 14

Detecting and refurbishing ground truth errors during training of deep learning-based echocardiography segmentation models

Iman Islam, Bram Ruijsink, Andrew J. Reader, Andrew P. King

arXiv:2604.12832 · 11.4 · h-index: 23

AI Analysis

For medical image segmentation practitioners, this work offers a practical method to handle label errors, though the improvement is incremental and limited to high-error scenarios.

The study examines the robustness of deep learning models to ground truth errors in echocardiography segmentation and proposes a method to detect and refurbish erroneous labels during training. The Variance of Gradients method effectively flags errors, and the refurbishment approach improves performance, especially under high-error conditions.

Deep learning-based medical image segmentation typically relies on ground truth (GT) labels obtained through manual annotation, but these can be prone to random errors or systematic biases. This study examines the robustness of deep learning models to such errors in echocardiography (echo) segmentation and evaluates a novel strategy for detecting and refurbishing erroneous labels during model training. Using the CAMUS dataset, we simulate three error types, then compare a loss-based GT label error detection method with one based on Variance of Gradients (VOG). We also propose a pseudo-labelling approach to refurbish suspected erroneous GT labels, and we assess the proposed approach under varying error levels. Results show that VOG is highly effective at flagging erroneous GT labels during training. However, a standard U-Net maintains strong performance under random label errors and under moderate levels of systematic error (up to 50%). The detection and refurbishment approach improves performance, particularly under high-error conditions.
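To make the detect-then-refurbish idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes that a VOG score is the per-pixel variance of a sample's input-gradient maps across training checkpoints, averaged over pixels, and that refurbishment simply swaps a flagged GT mask for the model's own prediction. The function names (`vog_scores`, `flag_suspects`, `refurbish`), the `fraction` threshold, and the synthetic data are all hypothetical.

```python
import numpy as np


def vog_scores(grads):
    """Return one VOG-style score per sample.

    grads: array of shape (K, N, H, W) holding an input-gradient map for
    each of N samples at each of K checkpoints (assumed layout).
    """
    grads = np.asarray(grads, dtype=np.float64)
    per_pixel_var = grads.var(axis=0)        # variance across checkpoints -> (N, H, W)
    return per_pixel_var.mean(axis=(1, 2))   # average over pixels -> (N,)


def flag_suspects(scores, fraction=0.1):
    """Return indices of the highest-scoring samples (hypothetical rule)."""
    n_flag = max(1, int(round(fraction * len(scores))))
    return np.argsort(scores)[::-1][:n_flag]


def refurbish(labels, predictions, suspect_idx):
    """Replace flagged GT masks with model predictions (pseudo-labels)."""
    cleaned = labels.copy()
    cleaned[suspect_idx] = predictions[suspect_idx]
    return cleaned


# Synthetic demo: sample 3 is given unstable gradients on purpose.
rng = np.random.default_rng(0)
K, N, H, W = 5, 8, 4, 4
grads = rng.normal(0.0, 0.1, size=(K, N, H, W))
grads[:, 3] = rng.normal(0.0, 5.0, size=(K, H, W))

scores = vog_scores(grads)
suspects = flag_suspects(scores, fraction=0.125)    # flags 1 of 8 samples

labels = np.zeros((N, H, W), dtype=np.int64)        # stand-in GT masks
predictions = np.ones((N, H, W), dtype=np.int64)    # stand-in model outputs
cleaned = refurbish(labels, predictions, suspects)
```

In a real training loop the gradient maps would come from backpropagating through the segmentation network at saved checkpoints, and the flagging rule would need tuning against the expected error rate. The fixed fraction used here is only a placeholder.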
