Gender Slopes: Counterfactual Fairness for Computer Vision Models by Attribute Manipulation
This work addresses fairness issues in automated computer vision systems used in security and law enforcement. Its contribution is incremental, since it applies existing attribute manipulation methods to the task of diagnosing bias.
The paper examines bias in commercial computer vision classifiers by using an encoder-decoder network to synthesize facial images with manipulated gender and race attributes. It uses these images to measure counterfactual fairness, reporting for example that feminine faces may elicit higher scores for the concept of nurse and lower scores for STEM-related concepts. It also reports skewed gender representations in online search results for profession-related keywords and suggests that these may explain the origin of the model biases.
Automated computer vision systems have been applied in many domains including security, law enforcement, and personal devices, but recent reports suggest that these systems may produce biased results, discriminating against people in certain demographic groups. Diagnosing and understanding the underlying true causes of model biases, however, are challenging tasks because modern computer vision systems rely on complex black-box models whose behaviors are hard to decode. We propose to use an encoder-decoder network developed for image attribute manipulation to synthesize facial images varying in the dimensions of gender and race while keeping other signals intact. We use these synthesized images to measure counterfactual fairness of commercial computer vision classifiers by examining the degree to which these classifiers are affected by gender and racial cues controlled in the images, e.g., feminine faces may elicit higher scores for the concept of nurse and lower scores for STEM-related concepts. We also report the skewed gender representations in an online search service on profession-related keywords, which may explain the origin of the biases encoded in the models.
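As an illustration of the measurement described above, the following minimal sketch estimates how strongly a classifier's score for one concept depends on the manipulated attribute. It assumes that each synthesized image carries a numeric manipulation level (negative for more masculine, positive for more feminine) and that sensitivity is summarized by an ordinary least-squares slope of score on level. The function name `attribute_slope`, the level scale, and all score values are hypothetical and are not taken from the paper or from any commercial API.

```python
from statistics import mean


def attribute_slope(levels, scores):
    """Ordinary least-squares slope of classifier score on manipulation level.

    levels: numeric manipulation levels of the synthesized images (assumed scale).
    scores: classifier scores for one concept, one per image, in the same order.
    """
    if len(levels) != len(scores) or len(levels) < 2:
        raise ValueError("need two or more paired levels and scores")
    x_bar = mean(levels)
    y_bar = mean(scores)
    sxx = sum((x - x_bar) ** 2 for x in levels)
    if sxx == 0:
        raise ValueError("levels must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(levels, scores))
    return sxy / sxx


# Hypothetical example: five images synthesized from one source face.
levels = [-2, -1, 0, 1, 2]
# Made-up classifier scores for two concepts (illustration only).
nurse_scores = [0.10, 0.15, 0.20, 0.25, 0.30]
engineer_scores = [0.40, 0.35, 0.30, 0.25, 0.20]

nurse_slope = attribute_slope(levels, nurse_scores)
engineer_slope = attribute_slope(levels, engineer_scores)
print(f"nurse slope: {nurse_slope:.3f}")
print(f"engineer slope: {engineer_slope:.3f}")
```

Under these assumptions, a slope near zero means the score does not change with the manipulated attribute, which is what counterfactual fairness would require for that concept. A positive slope means the score rises as the face is made more feminine, and a negative slope means it falls.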