CVMay 28
Fairness Beyond Demographics: Optimizing Performance Across Appearance-Based Hidden Cohorts in Medical ImagingMilad Masroor, Cuong Nguyen, Kevin Wells et al.
Medical image analysis models can exhibit performance disparities across patient subgroups, threatening clinical safety and fairness. Existing methods typically address this issue by optimizing accuracy and fairness metrics for visible demographic attributes (e.g., sex or age) considered in isolation. This strategy not only overlooks potentially more informative latent stratifications, which may reveal deeper sources of model error and inequity, but also fails to scale when multiple demographic attributes are considered simultaneously due to the resulting sparsity of training data within each subgroup. We deal with these issues by introducing the label-free hidden-cohort fairness (LHCF) training paradigm that instead of maximizing fairness over visible demographic attributes, it optimizes fairness across latent subpopulations discovered from image appearance. By clustering images into K appearance-based cohorts and applying fairness optimization over them, LHCF uncovers underlying sources of model error and avoids the combinatorial sparsity of multi-demographic attributes, reducing disparities across both single and multiple demographic attributes. We demonstrate on our proposed fairness benchmark, HIDFairBench, that LHCF provides state-of-the-art fairness results on single and multiple demographic attributes, despite never using demographic labels for training. Our results position hidden-cohort fairness as a practical, scalable, and robust alternative to demographic-based fairness optimization for trustworthy medical image analysis.
LGApr 28
People-Centred Medical Image AnalysisZheng Zhang, Milad Masroor, Cuong Nguyen et al.
Recent advances in data-centric medical AI have produced highly accurate diagnostic systems, but the emphasis on data curation and performance metrics has not translated into widespread clinical adoption. We conjecture that this limited uptake stems from insufficient attention dedicated to the optimisation of fair performance across diverse patient populations and to workflow integration: performance biases can create regulatory barriers, and poorly integrated automation can disrupt clinical routines, degrade the quality of human-AI collaboration, and reduce clinicians' willingness to adopt AI tools. Prior work on workflow integration (e.g., Learning to Defer (L2D) and Learning to Complement (L2C)) and AI fairness has typically examined these challenges in isolation, overlooking their natural interdependence and the practical constraints of clinical environments, such as restricted clinician availability. We propose People-Centred Medical Image Analysis (PecMan), a human-AI framework that jointly optimises fairness, diagnostic accuracy, and workflow effectiveness through a dynamic gating mechanism that assigns cases to AI, clinicians, or both under clinician workload constraints. We also introduce the Fairness and Human-Centred AI (FairHAI) benchmark for evaluating trade-offs between accuracy, fairness, and clinician workload. Experiments using this benchmark show that PecMan consistently outperforms existing methods, paving the way for more trustworthy and clinically viable AI systems. Code will be available upon paper acceptance.
CVNov 18, 2024
Fair Distillation: Teaching Fairness from Biased Teachers in Medical ImagingMilad Masroor, Tahir Hassan, Yu Tian et al.
Deep learning has achieved remarkable success in image classification and segmentation tasks. However, fairness concerns persist, as models often exhibit biases that disproportionately affect demographic groups defined by sensitive attributes such as race, gender, or age. Existing bias-mitigation techniques, including Subgroup Re-balancing, Adversarial Training, and Domain Generalization, aim to balance accuracy across demographic groups, but often fail to simultaneously improve overall accuracy, group-specific accuracy, and fairness due to conflicts among these interdependent objectives. We propose the Fair Distillation (FairDi) method, a novel fairness approach that decomposes these objectives by leveraging biased ``teacher'' models, each optimized for a specific demographic group. These teacher models then guide the training of a unified ``student'' model, which distills their knowledge to maximize overall and group-specific accuracies, while minimizing inter-group disparities. Experiments on medical imaging datasets show that FairDi achieves significant gains in both overall and group-specific accuracy, along with improved fairness, compared to existing methods. FairDi is adaptable to various medical tasks, such as classification and segmentation, and provides an effective solution for equitable model performance.