Luke Whitbread

LG
3 papers
4 citations
Novelty: 42%
AI Score: 39

3 Papers

LG Mar 1, 2022
Uncertainty categories in medical image segmentation: a study of source-related diversity

Luke Whitbread, Mark Jenkinson

Measuring uncertainties in the output of a deep learning method is useful in several ways, such as in assisting with interpretation of the outputs, helping build confidence with end users, and for improving the training and performance of the networks. Several different methods have been proposed to estimate uncertainties, including those from epistemic (relating to the model used) and aleatoric (relating to the data) sources using test-time dropout and augmentation, respectively. Not only are these uncertainty sources different, but they are governed by parameter settings (e.g., dropout rate or type and level of augmentation) that establish even more distinct uncertainty categories. This work investigates how different the uncertainties from these categories are, in both magnitude and spatial pattern, to empirically address the question of whether they provide usefully distinct information that should be captured whenever uncertainties are used. We take the well-characterised BraTS challenge dataset to demonstrate that there are substantial differences in both magnitude and spatial pattern of uncertainties from the different categories, and discuss the implications of these in various use cases.
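As a rough illustration of the comparison described in this abstract, the sketch below turns a stack of stochastic predictions (for example from test-time dropout or test-time augmentation) into a per-voxel uncertainty map, then compares two such maps by magnitude and by spatial pattern. The function names, the choice of binary predictive entropy, and the use of Pearson correlation are assumptions made for this example; the abstract does not state which measures the paper uses.

```python
import numpy as np

def uncertainty_map(samples):
    """Per-voxel binary predictive entropy from stochastic forward passes.

    samples: array of shape (T, ...) holding foreground probabilities from
    T passes (hypothetical input; how the passes are produced is up to the caller).
    """
    # Average the T predictions, then clip to keep the logarithms finite.
    p = np.clip(np.mean(samples, axis=0), 1e-7, 1 - 1e-7)
    return -(p * np.log(p) + (1 - p) * np.log(1 - p))

def compare_maps(u_a, u_b):
    """Return (ratio of mean magnitudes, Pearson correlation of the spatial patterns)."""
    magnitude_ratio = float(u_a.mean() / u_b.mean())
    spatial_corr = float(np.corrcoef(u_a.ravel(), u_b.ravel())[0, 1])
    return magnitude_ratio, spatial_corr
```

A magnitude ratio far from 1 combined with a low spatial correlation would suggest that two uncertainty categories carry different information, which is the kind of question the abstract raises.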

12.6 CV May 19
Robust Mitigation of Age-Dependent Confounding Effects via Sample-Difficulty Decorrelation

Nikhil Cherian Kurian, Victor Caquilpan Parra, Abin Shoby et al.

Age-dependent performance disparities in medical image classification often arise because age acts as a confounder, linking imaging morphology with disease prevalence. In practice, disparities can manifest as overdiagnosis at ages where disease prevalence is higher and underdiagnosis at ages where prevalence is lower, and can worsen under train-test shifts in the age distribution. Conventional mitigation approaches that enforce strict age invariance may suppress diagnostically meaningful information encoded in age. We therefore propose a robust framework that mitigates the effects of age-dependent confounding by targeting spurious age-linked trends rather than enforcing invariance. Following a warm-up phase, we characterize sample difficulty and model its age-dependent trends in a label-conditioned manner. We decorrelate age from dominant age-difficulty trends using robust, Huber-weighted affinity weights, attenuating confounding-driven shortcuts while preserving clinically meaningful, nonlinear age information. We further introduce an Age Coverage Score that scales the decorrelation penalty by minibatch age variance to ensure stable optimization under limited age diversity. Across two radiology datasets, our approach reduces age-dependent true and false positive disparities with minimal AUC impact and remains robust to increasing train-test age distribution shifts.
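The abstract does not give formulas, so the following is only a loose sketch of the general idea: a correlation penalty between age and sample difficulty, computed with Huber-style weights and scaled down when a minibatch covers little of the age range. Everything here is an assumption made for illustration, including the function names, the standardisation of difficulty, the `delta` threshold, the reference variance `ref_age_var`, and the clipped variance ratio standing in for the Age Coverage Score. It is not the authors' implementation.

```python
import numpy as np

def huber_weights(residuals, delta=1.0):
    """Standard Huber weights: 1 where |r| <= delta, delta / |r| elsewhere."""
    abs_r = np.abs(residuals)
    return np.where(abs_r <= delta, 1.0, delta / np.maximum(abs_r, 1e-12))

def weighted_corr(x, y, w):
    """Weighted Pearson correlation between x and y."""
    w = w / w.sum()
    mx, my = np.sum(w * x), np.sum(w * y)
    cov = np.sum(w * (x - mx) * (y - my))
    vx = np.sum(w * (x - mx) ** 2)
    vy = np.sum(w * (y - my) ** 2)
    return cov / np.sqrt(vx * vy + 1e-12)

def decorrelation_penalty(age, difficulty, ref_age_var, delta=1.0):
    """Squared weighted correlation, scaled by a hypothetical coverage factor."""
    # Down-weight samples whose difficulty is far from the batch median (assumed scheme).
    z = (difficulty - np.median(difficulty)) / (np.std(difficulty) + 1e-12)
    w = huber_weights(z, delta)
    # Assumed stand-in for the Age Coverage Score: batch age variance over a
    # reference variance, clipped to at most 1.
    coverage = min(1.0, float(np.var(age)) / ref_age_var)
    return coverage * weighted_corr(age, difficulty, w) ** 2
```

Under these assumptions, a minibatch in which every sample has the same age contributes no penalty, which matches the stated motivation of stabilising optimisation when age diversity is limited.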

40.5 LG May 19
Worst-Group Equalized Odds Regularization for Multi-Attribute Fair Medical Image Classification

Nikhil Cherian Kurian, Victor Caquilpan Parra, Abin Shoby et al.

Diagnostic performance in medical AI varies systematically across demographic groups, yet subgroup AUC can mask clinically important disparities. At a fixed inference-time operating point, some groups may exhibit over-diagnostic behaviour, characterized by elevated true and false positive rates, while others show under-diagnostic patterns with reduced true and false positive rates. These opposing tendencies can cancel in aggregate AUCs while producing meaningful inequities in clinical decision-making. Motivated by the need to assess and mitigate such disparities at the operating point and across multiple demographic attributes simultaneously, we propose a worst-group equalized-odds margin regularizer. The proposed regularizer explicitly targets subgroup-level deviations on both the true positive and false positive sides at inference. At each update, the method identifies subgroups defined by explicit demographic attributes (e.g., age, sex, and race) that exhibit the most extreme margin deviations and applies a unified penalty, enabling fairness optimization across multiple demographic axes without requiring explicit intersectional constraints. Across two medical imaging datasets in realistic multi-label settings, our method consistently reduces disparities in Equalized Odds and Equalized Opportunity with minimal impact on AUC, preserving diagnostic performance while improving fairness.
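To make the operating-point view in this abstract concrete, the sketch below computes true and false positive rates per subgroup at a fixed threshold and reports the subgroup, across several attributes, that deviates most from the overall rates. This is an evaluation-side illustration only. It is not the paper's margin regularizer, and the function names, the `attributes` dictionary layout, and the use of overall rates as the reference point are assumptions made for this example.

```python
import numpy as np

def rates(y_true, y_pred):
    """Return (TPR, FPR); a rate is NaN when its denominator is empty."""
    pos, neg = y_true == 1, y_true == 0
    tpr = y_pred[pos].mean() if pos.any() else np.nan
    fpr = y_pred[neg].mean() if neg.any() else np.nan
    return tpr, fpr

def worst_group_eo_gap(y_true, scores, attributes, threshold=0.5):
    """Largest |TPR or FPR deviation| from the overall rates over all subgroups.

    attributes: dict mapping an attribute name to an array of group labels,
    one label per sample (hypothetical layout).
    Returns (gap, (attribute_name, group_label)).
    """
    y_pred = (scores >= threshold).astype(float)
    tpr_all, fpr_all = rates(y_true, y_pred)
    worst = (0.0, None)
    for name, values in attributes.items():
        for g in np.unique(values):
            mask = values == g
            tpr, fpr = rates(y_true[mask], y_pred[mask])
            # nanmax skips a rate that is undefined for this subgroup.
            gap = np.nanmax([abs(tpr - tpr_all), abs(fpr - fpr_all)])
            if gap > worst[0]:
                worst = (float(gap), (name, g))
    return worst
```

Because each attribute is scanned separately, no intersectional subgroups are enumerated, which loosely mirrors the abstract's point about covering multiple demographic axes without explicit intersectional constraints.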