IR, LG, ML | Jan 27, 2019

Bias in Bios: A Case Study of Semantic Representation Bias in a High-Stakes Setting

arXiv:1901.09451v1 | 567 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses bias in high-stakes machine learning applications such as job allocation, where errors can negatively affect people's lives. Its contribution is incremental, since it builds on existing bias analysis methods.

The study investigated gender bias in occupation classification by analyzing how explicit gender indicators in semantic representations affect classification accuracy and allocation harms, finding that differences in true positive rates between genders correlate with existing gender imbalances in occupations.

We present a large-scale study of gender bias in occupation classification, a task where the use of machine learning may lead to negative outcomes on people's lives. We analyze the potential allocation harms that can result from semantic representation bias. To do so, we study the impact on occupation classification of including explicit gender indicators (such as first names and pronouns) in different semantic representations of online biographies. Additionally, we quantify the bias that remains when these indicators are "scrubbed," and describe proxy behavior that occurs in the absence of explicit gender indicators. As we demonstrate, differences in true positive rates between genders are correlated with existing gender imbalances in occupations, which may compound these imbalances.
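
As a rough illustration of the quantity the abstract refers to, the sketch below computes a per-occupation true positive rate (TPR) gap between genders alongside the share of women in each occupation. It is a minimal example under stated assumptions, not the authors' code: the function name `tpr_gap_by_occupation`, the `(true_occupation, predicted_occupation, gender)` record layout, the binary "F"/"M" labels, and the toy data are all hypothetical.

```python
from collections import defaultdict


def tpr_gap_by_occupation(records):
    """Return {occupation: (female_share, tpr_female - tpr_male)}.

    `records` is an iterable of (true_occupation, predicted_occupation, gender)
    tuples with gender in {"F", "M"} (an assumed, simplified encoding).
    Occupations lacking examples for either gender are skipped.
    """
    totals = defaultdict(int)  # (occupation, gender) -> number of people
    hits = defaultdict(int)    # (occupation, gender) -> correctly classified
    for true_occ, pred_occ, gender in records:
        totals[(true_occ, gender)] += 1
        if pred_occ == true_occ:
            hits[(true_occ, gender)] += 1

    result = {}
    for occ in sorted({occ for occ, _ in totals}):
        n_f = totals.get((occ, "F"), 0)
        n_m = totals.get((occ, "M"), 0)
        if n_f == 0 or n_m == 0:
            continue  # TPR is undefined for a gender with no examples
        tpr_f = hits.get((occ, "F"), 0) / n_f
        tpr_m = hits.get((occ, "M"), 0) / n_m
        result[occ] = (n_f / (n_f + n_m), tpr_f - tpr_m)
    return result


# Toy data, invented purely for illustration.
toy_records = [
    ("nurse", "nurse", "F"), ("nurse", "nurse", "F"), ("nurse", "nurse", "F"),
    ("nurse", "nurse", "M"), ("nurse", "teacher", "M"),
    ("surgeon", "surgeon", "F"), ("surgeon", "nurse", "F"),
    ("surgeon", "surgeon", "M"), ("surgeon", "surgeon", "M"),
    ("surgeon", "surgeon", "M"),
]

gaps = tpr_gap_by_occupation(toy_records)
for occupation, (female_share, gap) in gaps.items():
    print(f"{occupation}: female share={female_share:.2f}, TPR gap={gap:+.2f}")
```

In this toy data the occupation with more women has a positive gap and the one with fewer women has a negative gap, which is the kind of pattern the abstract describes as correlating with existing imbalances. Real data would require many more occupations before any correlation could be estimated.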

Code Implementations (5 repos)