Predict Patient Self-reported Race from Skin Histological Images
This work addresses bias risks in computational pathology, an issue crucial to equitable AI deployment in healthcare, although its contribution is incremental, focusing on data curation and feature analysis.
The study examines whether deep learning models learn unintended demographic biases by testing whether they can predict self-reported race from dermatopathology slides. In the final curated experiment, models reached AUCs of 0.799 for the White group and 0.762 for the Black group, with overall performance of 0.663, and attention analysis identified the epidermis as a key predictive feature.
Artificial Intelligence (AI) has demonstrated success in computational pathology (CPath) for disease detection, biomarker classification, and prognosis prediction. However, its potential to learn unintended demographic biases, particularly those related to social determinants of health, remains understudied. This study investigates whether deep learning models can predict self-reported race from digitized dermatopathology slides and identifies potential morphological shortcuts. Using a multisite dataset with a racially diverse population, we apply an attention-based mechanism to uncover race-associated morphological features. We evaluated three dataset curation strategies to control for confounding factors. In the final experiment, the White and Black demographic groups retained high prediction performance (AUC: 0.799 and 0.762, respectively), while overall performance dropped to 0.663. Attention analysis revealed the epidermis as a key predictive feature: performance declined significantly when these regions were removed. These findings highlight the need for careful data curation and bias mitigation to ensure equitable AI deployment in pathology. Code is available at: https://github.com/sinai-computational-pathology/CPath_SAIF.
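
The attention analysis and region-removal test described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract states only that an attention-based mechanism was used, so the tanh-based attention pooling, the weight vectors `w_att` and `w_cls`, the helper `drop_top_attended`, and the toy tile embeddings below are all assumptions made for illustration. The sketch pools tile embeddings into a slide-level probability, then removes the most-attended tiles and recomputes the prediction.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(tiles, w_att, w_cls):
    """Pool tile embeddings into one slide-level probability.

    tiles: list of tile embeddings (each a list of floats)
    w_att: hypothetical attention weights, one per embedding dimension
    w_cls: hypothetical classifier weights, one per embedding dimension
    Returns (slide_probability, attention_weights).
    """
    # Unnormalized attention score per tile: tanh of a linear projection.
    raw = [math.tanh(sum(w * x for w, x in zip(w_att, t))) for t in tiles]
    att = softmax(raw)
    dim = len(tiles[0])
    # Slide embedding is the attention-weighted sum of tile embeddings.
    slide = [sum(a * t[d] for a, t in zip(att, tiles)) for d in range(dim)]
    logit = sum(w * x for w, x in zip(w_cls, slide))
    return 1.0 / (1.0 + math.exp(-logit)), att


def drop_top_attended(tiles, att, k):
    """Remove the k tiles with the highest attention (ablation step)."""
    ranked = sorted(range(len(tiles)), key=lambda i: att[i], reverse=True)
    removed = set(ranked[:k])
    return [t for i, t in enumerate(tiles) if i not in removed]


if __name__ == "__main__":
    # Toy data (assumed): two tiles carry a strong signal in dimension 0.
    tiles = [[2.0, 0.1], [0.1, 0.2], [0.0, 0.3], [1.8, 0.0]]
    w_att = [1.0, 0.0]
    w_cls = [1.5, -0.5]
    p_full, att = attention_pool(tiles, w_att, w_cls)
    ablated = drop_top_attended(tiles, att, 2)
    p_ablated, _ = attention_pool(ablated, w_att, w_cls)
    print(f"full: {p_full:.3f}, after ablation: {p_ablated:.3f}")
```

In this toy setup the two high-attention tiles stand in for a region such as the epidermis. A drop in the slide-level score after removing them mirrors, in spirit only, the performance decline the abstract reports when epidermal regions were removed.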