Latent Feature Alignment: Discovering Biased and Interpretable Subpopulations in Face Recognition Models
This work addresses bias auditing of face recognition models, letting practitioners identify biased subpopulations without predefined attribute labels. The contribution is incremental, since it builds on existing clustering methods.
The paper tackles systematic bias in face recognition models by introducing Latent Feature Alignment (LFA), an attribute-label-free algorithm that identifies biased and interpretable subpopulations. LFA outperforms k-means and nearest-neighbor search in intra-group semantic coherence across four models and two benchmarks.
Modern face recognition models achieve high overall accuracy but continue to exhibit systematic biases that disproportionately affect certain subpopulations. Conventional bias evaluation frameworks rely on labeled attributes to form subpopulations, which are expensive to obtain and limited to predefined categories. We introduce Latent Feature Alignment (LFA), an attribute-label-free algorithm that uses latent directions to identify subpopulations. This yields two main benefits over standard clustering: (i) semantically coherent grouping, where faces sharing common attributes are grouped together more reliably than by proximity-based methods, and (ii) discovery of interpretable directions, which correspond to semantic attributes such as age, ethnicity, or attire. Across four state-of-the-art recognition models (ArcFace, CosFace, ElasticFace, PartialFC) and two benchmarks (RFW, CelebA), LFA consistently outperforms k-means and nearest-neighbor search in intra-group semantic coherence, while uncovering interpretable latent directions aligned with demographic and contextual attributes. These results position LFA as a practical method for representation auditing of face recognition models, enabling practitioners to identify and interpret biased subpopulations without predefined attribute annotations.
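The abstract does not spell out how LFA computes or uses its latent directions, so the following is only an illustrative sketch of the general contrast it draws, not the authors' algorithm. It assumes face embeddings are rows of a NumPy array, that a candidate latent direction is already available as a vector, and that "intra-group semantic coherence" can be approximated by the share of a group carrying its most common attribute label. The function names (`group_by_direction`, `group_by_nearest_neighbors`, `majority_share`) and the scoring rule are hypothetical, chosen for illustration.

```python
import numpy as np


def l2_normalize(x, eps=1e-12):
    """Scale vectors to unit length along the last axis."""
    x = np.asarray(x, dtype=float)
    norm = np.linalg.norm(x, axis=-1, keepdims=True)
    return x / np.maximum(norm, eps)


def group_by_direction(embeddings, direction, k):
    """Pick the k embeddings best aligned with a latent direction.

    Assumption: alignment is measured by cosine similarity between each
    embedding and the direction. The paper may define it differently.
    """
    scores = l2_normalize(embeddings) @ l2_normalize(direction)
    return np.argsort(-scores)[:k]


def group_by_nearest_neighbors(embeddings, anchor_index, k):
    """Proximity baseline: the k embeddings most similar to an anchor face."""
    z = l2_normalize(embeddings)
    sims = z @ z[anchor_index]
    return np.argsort(-sims)[:k]


def majority_share(labels, indices):
    """Coherence proxy (assumed): fraction of the group sharing its most
    common label. Labels are used only to evaluate a group, never to form it."""
    group_labels = np.asarray(labels)[np.asarray(indices)]
    _, counts = np.unique(group_labels, return_counts=True)
    return counts.max() / len(group_labels)


if __name__ == "__main__":
    # Tiny hand-made example: the first two faces point along the direction.
    embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [-1.0, 0.0]])
    labels = np.array([0, 0, 1, 1])
    direction = np.array([1.0, 0.0])

    by_direction = group_by_direction(embeddings, direction, k=2)
    by_neighbors = group_by_nearest_neighbors(embeddings, anchor_index=2, k=2)
    print("direction group:", by_direction, majority_share(labels, by_direction))
    print("neighbor group:", by_neighbors, majority_share(labels, by_neighbors))
```

In this sketch the attribute labels appear only in the evaluation step, which mirrors the attribute-label-free setting described above: groups are formed from the embeddings alone and scored afterwards. How candidate directions are discovered is left open here, since the abstract does not say.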