Social-Group-Agnostic Word Embedding Debiasing via the Stereotype Content Model
This work addresses bias mitigation in NLP across social groups and offers a more practical approach than group-specific debiasing, though the contribution is incremental because it builds on existing debiasing methods.
Word embedding debiasing typically requires social-group-specific word pairs, which is impractical for understudied groups. The paper proposes a social-group-agnostic alternative based on the Stereotype Content Model (SCM), using only warmth and competence term pairs, and finds that SCM-based debiasing performs comparably to group-specific methods across gender, race, and age.
Existing word embedding debiasing methods require social-group-specific word pairs (e.g., "man"-"woman") for each social attribute (e.g., gender). Pairs chosen for one attribute cannot be used to mitigate bias for other social groups, which makes it impractical or costly to extend these methods to understudied social groups. We propose that the Stereotype Content Model (SCM) can help debiasing efforts become social-group-agnostic by capturing the underlying connection between bias and stereotypes. The SCM is a theoretical framework developed in social psychology for understanding the content of stereotypes, and it structures stereotype content along two psychological dimensions, "warmth" and "competence". Using only pairs of terms for warmth (e.g., "genuine"-"fake") and competence (e.g., "smart"-"stupid"), we perform debiasing with established methods and find that, across gender, race, and age, SCM-based debiasing performs comparably to group-specific debiasing.
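
As a rough illustration of the idea in the abstract, the Python sketch below applies a projection-based ("hard debiasing"-style) step using only warmth and competence pairs. It is a minimal sketch under stated assumptions, not the paper's implementation: the embeddings are random toy vectors rather than trained ones, and the helper names (`pair_direction`, `remove_subspace`), the 8-dimensional size, the target words, and the choice of one direction per SCM dimension are all hypothetical.

```python
import numpy as np

# Toy stand-in for a trained embedding table (assumption: random vectors).
rng = np.random.default_rng(0)
words = ["genuine", "fake", "smart", "stupid", "nurse", "engineer"]
emb = {w: rng.normal(size=8) for w in words}

# SCM term pairs taken from the examples in the abstract.
warmth_pairs = [("genuine", "fake")]
competence_pairs = [("smart", "stupid")]


def pair_direction(emb, pairs):
    """Top principal direction of the centered pair vectors."""
    rows = []
    for a, b in pairs:
        center = (emb[a] + emb[b]) / 2.0
        rows.append(emb[a] - center)
        rows.append(emb[b] - center)
    # Rows of vt are orthonormal; the first one explains the most variance.
    _, _, vt = np.linalg.svd(np.stack(rows), full_matrices=False)
    return vt[0]


def remove_subspace(vec, basis):
    """Project vec onto the orthogonal complement of the basis rows,
    then rescale to unit length."""
    projected = vec - basis.T @ (basis @ vec)
    norm = np.linalg.norm(projected)
    return projected / norm if norm > 0 else projected


# One direction per SCM dimension, orthonormalized into a 2-D subspace.
directions = np.stack([
    pair_direction(emb, warmth_pairs),
    pair_direction(emb, competence_pairs),
])
q, _ = np.linalg.qr(directions.T)  # q has shape (8, 2)
scm_basis = q.T                    # rows are orthonormal

# Debias target words without using any group-specific pairs.
targets = ["nurse", "engineer"]
debiased = {w: remove_subspace(emb[w], scm_basis) for w in targets}

for w in targets:
    residual = np.abs(scm_basis @ debiased[w]).max()
    print(f"{w}: largest remaining SCM component = {residual:.2e}")
```

The same warmth and competence subspace is reused for every target word, which is what makes the step social-group-agnostic in this sketch. A group-specific variant would instead build the subspace from pairs such as "man"-"woman".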