LG AI AP ME | Mar 30

Fairboard: a quantitative framework for equity assessment of healthcare models

James K. Ruffle, Samia Mohinta, Chris Foulon, Mohamad Zeina, Zicheng Wang, Sebastian Brandner, Harpreet Hyare, Parashkev Nachev

arXiv:2604.09656 | 54.2 | h-index: 5 | Has Code

AI Analysis

For medical AI practitioners, this work provides a framework and tool to assess model fairness across patient subgroups, addressing a critical gap in equity evaluation for FDA-authorized devices.

This paper evaluates the equity of 18 brain tumor segmentation models across 648 glioma patients, finding that patient identity explains more performance variance than model choice and that clinical factors predict accuracy more strongly than architecture. The authors release Fairboard, an open-source dashboard for equitable model monitoring.

Despite there now being more than 1,000 FDA-authorised AI medical devices, formal equity assessments -- whether model performance is uniform across patient subgroups -- are rare. Here, we evaluate the equity of 18 open-source brain tumour segmentation models across 648 glioma patients from two independent datasets (n = 11,664 model inferences) along distinct univariate, Bayesian multivariate, spatial, and representational dimensions. We find that patient identity consistently explains more performance variance than model choice, with clinical factors, including molecular diagnosis, tumour grade, and extent of resection, predicting segmentation accuracy more strongly than model architecture. A voxel-wise spatial meta-analysis identifies neuroanatomically localised biases that are compartment-specific yet often consistent across models. Within a high-dimensional latent space of lesion masks and clinico-demographic features, model performance clusters significantly, indicating that the patient feature space contains axes of algorithmic vulnerability. Although newer models tend toward greater equity, none provide a formal fairness guarantee. Lastly, we release Fairboard, an open-source, no-code dashboard that lowers barriers to equitable model monitoring in medical imaging.
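To make the "patient identity explains more variance than model choice" claim concrete, the following is a minimal sketch of a balanced two-way sum-of-squares decomposition over a patients x models score matrix. It is not the authors' analysis (the abstract describes univariate and Bayesian multivariate methods without giving details), and it does not use Fairboard. The function name `variance_shares`, the effect sizes, and the synthetic scores are all assumptions chosen for illustration; only the counts (648 patients, 18 models, 11,664 inferences) come from the abstract.

```python
import numpy as np


def variance_shares(scores):
    """Split the variance of a (n_patients, n_models) score matrix into
    patient, model, and residual shares (balanced two-way layout,
    one observation per patient-model pair). Hypothetical helper."""
    scores = np.asarray(scores, dtype=float)
    n_patients, n_models = scores.shape
    grand = scores.mean()
    patient_means = scores.mean(axis=1)  # one mean per patient (row)
    model_means = scores.mean(axis=0)    # one mean per model (column)

    ss_total = ((scores - grand) ** 2).sum()
    ss_patient = n_models * ((patient_means - grand) ** 2).sum()
    ss_model = n_patients * ((model_means - grand) ** 2).sum()
    ss_resid = ss_total - ss_patient - ss_model

    return {
        "patient": ss_patient / ss_total,
        "model": ss_model / ss_total,
        "residual": ss_resid / ss_total,
    }


# Synthetic data only: the effect sizes below are invented, not taken
# from the paper, and the scores are not clipped to a valid Dice range.
rng = np.random.default_rng(0)
n_patients, n_models = 648, 18
patient_effect = rng.normal(0.0, 0.10, size=(n_patients, 1))
model_effect = rng.normal(0.0, 0.02, size=(1, n_models))
noise = rng.normal(0.0, 0.03, size=(n_patients, n_models))
scores = 0.85 + patient_effect + model_effect + noise

shares = variance_shares(scores)
print(scores.size)  # 648 * 18 = 11664 patient-model inferences
print(shares)
```

In this sketch the patient share dominates only because the synthetic patient effect was made larger than the model effect. With real per-case metrics, the same decomposition would show which factor dominates in a given evaluation.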
