Colorimeter-Supervised Skin Tone Estimation from Dermatoscopic Images for Fairness Auditing
This work addresses the lack of reliable skin-tone annotations in public dermatoscopy datasets, which makes bias auditing of clinical decision support systems possible. The contribution is incremental, since it builds on existing skin-tone estimation methods.
The paper tackles fairness auditing of neural-network-based dermatoscopic diagnosis. It trains neural networks to predict Fitzpatrick skin type and the Individual Typology Angle (ITA) from images, using in-person Fitzpatrick labels and colorimeter measurements as targets. The Fitzpatrick predictions reach agreement comparable to human crowdsourced annotations, and the ITA predictions show high concordance with colorimeter-derived ITA, outperforming pixel-averaging approaches.
Neural-network-based diagnosis from dermatoscopic images is increasingly used for clinical decision support, yet studies report performance disparities across skin tones. Fairness auditing of these models is limited by the lack of reliable skin-tone annotations in public dermatoscopy datasets. We address this gap with neural networks that predict Fitzpatrick skin type via ordinal regression and the Individual Typology Angle (ITA) via color regression, using in-person Fitzpatrick labels and colorimeter measurements as targets. We further leverage extensive pretraining on synthetic and real dermatoscopic and clinical images. The Fitzpatrick model achieves agreement comparable to human crowdsourced annotations, and ITA predictions show high concordance with colorimeter-derived ITA, substantially outperforming pixel-averaging approaches. Applying these estimators to ISIC 2020 and MILK10k, we find that fewer than 1% of subjects belong to Fitzpatrick types V and VI. We release code and pretrained models as an open-source tool for rapid skin-tone annotation and bias auditing. This is, to our knowledge, the first dermatoscopic skin-tone estimation neural network validated against colorimeter measurements, and it supports growing evidence of clinically relevant performance gaps across skin-tone groups.
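For readers unfamiliar with ITA, the following minimal sketch shows how the angle is conventionally computed from CIELAB values, using the standard definition ITA = arctan((L* - 50) / b*) × 180/π. The function names `ita_degrees` and `ita_category` are hypothetical, and the six-group thresholds are the commonly cited ones from the colorimetry literature; the abstract does not state which grouping, if any, the authors use, so treat the mapping as an assumption rather than the paper's method.

```python
import math


def ita_degrees(l_star: float, b_star: float) -> float:
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    Standard definition: ITA = arctan((L* - 50) / b*) * 180 / pi.
    atan2 is used here (an implementation choice, not from the paper):
    it equals the arctan form whenever b* > 0, which is the usual case
    for skin, and it avoids a division by zero when b* == 0.
    """
    return math.degrees(math.atan2(l_star - 50.0, b_star))


def ita_category(ita: float) -> str:
    """Map an ITA value to a commonly used six-group scale.

    Thresholds (55, 41, 28, 10, -30 degrees) are the widely cited ones;
    whether the paper uses them is an assumption.
    """
    if ita > 55:
        return "very light"
    if ita > 41:
        return "light"
    if ita > 28:
        return "intermediate"
    if ita > 10:
        return "tan"
    if ita > -30:
        return "brown"
    return "dark"


if __name__ == "__main__":
    # Illustrative L*, b* pairs, not measurements from the paper.
    for l_star, b_star in [(70.0, 20.0), (50.0, 15.0), (30.0, 20.0)]:
        ita = ita_degrees(l_star, b_star)
        print(f"L*={l_star}, b*={b_star} -> ITA={ita:.1f}, {ita_category(ita)}")
```

A pixel-averaging baseline of the kind the abstract compares against would presumably apply this formula to the mean L* and b* of image pixels, whereas the proposed approach regresses color with a neural network against colorimeter targets.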