CV · AI · May 27, 2025

TrustSkin: A Fairness Pipeline for Trustworthy Facial Affect Analysis Across Skin Tone

arXiv:2505.20637v1 · 2 citations · h-index: 2 · FG
Originality: Incremental advance
AI Analysis

This work addresses fairness in facial affect analysis for underrepresented demographic groups. Its contribution is incremental: it builds on existing methods for skin tone classification and fairness evaluation.

This study examines fairness in facial affect analysis across skin tone groups by comparing two skin tone classification methods. It reveals severe underrepresentation of dark skin tones (~2%) and fairness disparities across groups in F1-score (up to 0.08) and TPR (up to 0.11).

Understanding how facial affect analysis (FAA) systems perform across different demographic groups requires reliable measurement of sensitive attributes such as ancestry, often approximated by skin tone, which itself is highly influenced by lighting conditions. This study compares two objective skin tone classification methods: the widely used Individual Typology Angle (ITA) and a perceptually grounded alternative based on Lightness ($L^*$) and Hue ($H^*$). Using AffectNet and a MobileNet-based model, we assess fairness across skin tone groups defined by each method. Results reveal a severe underrepresentation of dark skin tones ($\sim 2 \%$), alongside fairness disparities in F1-score (up to 0.08) and TPR (up to 0.11) across groups. While ITA shows limitations due to its sensitivity to lighting, the $H^*$-$L^*$ method yields more consistent subgrouping and enables clearer diagnostics through metrics such as Equal Opportunity. Grad-CAM analysis further highlights differences in model attention patterns by skin tone, suggesting variation in feature encoding. To support future mitigation efforts, we also propose a modular fairness-aware pipeline that integrates perceptual skin tone estimation, model interpretability, and fairness evaluation. These findings emphasize the relevance of skin tone measurement choices in fairness assessment and suggest that ITA-based evaluations may overlook disparities affecting darker-skinned individuals.
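As a rough illustration of the quantities named in the abstract, the sketch below computes the Individual Typology Angle from CIELAB values, a hue angle, and a per-group TPR gap of the kind used for Equal Opportunity. It is not the authors' code. The ITA formula is the standard $\arctan((L^* - 50)/b^*)$ expressed in degrees. The hue angle is assumed here to be the usual CIELAB $h_{ab} = \operatorname{atan2}(b^*, a^*)$, which may differ from the paper's exact $H^*$ definition. The function names, the binary "positive class" simplification, and the group labels are all hypothetical.

```python
import math
from collections import defaultdict


def ita_degrees(L, b):
    """Individual Typology Angle in degrees from CIELAB L* and b*.

    Uses atan2, which equals arctan((L* - 50) / b*) whenever b* > 0
    (the typical case for skin pixels) and avoids division by zero.
    """
    return math.degrees(math.atan2(L - 50.0, b))


def hue_degrees(a, b):
    """CIELAB hue angle h_ab in degrees (assumed stand-in for the paper's H*)."""
    return math.degrees(math.atan2(b, a))


def tpr_by_group(y_true, y_pred, groups, positive=1):
    """True positive rate for one target class, computed per skin tone group."""
    true_pos = defaultdict(int)
    actual_pos = defaultdict(int)
    for truth, pred, group in zip(y_true, y_pred, groups):
        if truth == positive:
            actual_pos[group] += 1
            if pred == positive:
                true_pos[group] += 1
    # Groups with no positive samples are omitted, since TPR is undefined there.
    return {g: true_pos[g] / actual_pos[g] for g in actual_pos}


def equal_opportunity_gap(tprs):
    """Largest TPR difference between any two groups (0 means parity)."""
    return max(tprs.values()) - min(tprs.values())
```

A gap computed this way is only as reliable as the group assignment behind it, which is why the choice between ITA and an $H^*$-$L^*$ grouping matters. With very small groups, such as the ~2% dark skin tone share reported above, per-group TPR estimates are also noisy.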

