Andrea Prenner

h-index45
2papers

2 Papers

CVMay 13, 2025
Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS) challenge results

Meritxell Riera-Marin, Sikha O K, Julia Rodriguez-Comas et al.

Deep learning (DL) has become the dominant approach for medical image segmentation, yet ensuring the reliability and clinical applicability of these models requires addressing key challenges such as annotation variability, calibration, and uncertainty estimation. This is why we created the Calibration and Uncertainty for multiRater Volume Assessment in multiorgan Segmentation (CURVAS), which highlights the critical role of multiple annotators in establishing a more comprehensive ground truth, emphasizing that segmentation is inherently subjective and that leveraging inter-annotator variability is essential for robust model evaluation. Seven teams participated in the challenge, submitting a variety of DL models evaluated using metrics such as Dice Similarity Coefficient (DSC), Expected Calibration Error (ECE), and Continuous Ranked Probability Score (CRPS). By incorporating consensus and dissensus ground truth, we assess how DL models handle uncertainty and whether their confidence estimates align with true segmentation performance. Our findings reinforce the importance of well-calibrated models, as better calibration is strongly correlated with the quality of the results. Furthermore, we demonstrate that segmentation models trained on diverse datasets and enriched with pre-trained knowledge exhibit greater robustness, particularly in cases deviating from standard anatomical structures. Notably, the best-performing models achieved high DSC and well-calibrated uncertainty estimates. This work underscores the need for multi-annotator ground truth, thorough calibration assessments, and uncertainty-aware evaluations to develop trustworthy and clinically reliable DL-based medical image segmentation models.

IVOct 15, 2024
Prediction of Cardiovascular Risk Factors from Retinal Fundus Images using CNNs

Andrea Prenner

Early detection of cardiovascular disease risk factors is essential to alter the course of the disease. Previous studies showed that deep learning can successfully be used to detect such risk factors from retinal images. This study uses convolutional neural networks (CNNs) to predict the cardiovascular disease risk factors age, BMI, smoking status, HbA1c, systolic blood pressure, diastolic blood pressure, gender and total cholesterol from retinal images from the UK Biobank data set. By applying contrast enhancement on the retinal images in the form of Gaussian filtering and deriving predictions on individual basis through the combination of left and right retinal image predictions, an increased prediction performance could be derived for the variables age (R2 score of 0.81) and systolic blood pressure (R2 score of 0.39) compared to previous studies using retinal images from the UK Biobank data set. Further, this is the first study that tries to predict HbA1c and total cholesterol from UK Biobank retinal fundus images. For these variables the models achieved an R2 score of 0.0579 for predicting HbA1c and an R2 score of 0.0157 for predicting total cholesterol. These results show that the value of deriving predictions for these two risk factors from retinal fundus images from the UK Biobank data set is limited.