CL AI Jun 21, 2022

Don't Forget About Pronouns: Removing Gender Bias in Language Models Without Losing Factual Gender Information

arXiv:2206.10744v1631 citations
Originality: Incremental advance
AI Analysis

This work addresses gender bias in language models, a critical issue for fairness in AI applications. The advance is incremental, since it builds on existing probing and filtering techniques.

The paper tackles gender bias in language models by disentangling factual gender information from stereotypical bias in the embeddings. Its filtering method reduces the bias of gender-neutral profession names without significantly harming language modeling performance.

The representations in large language models contain multiple types of gender information. We focus on two types of such signals in English texts: factual gender information, which is a grammatical or semantic property, and gender bias, which is the correlation between a word and a specific gender. We can disentangle the model's embeddings and identify components encoding both types of information with probing. We aim to diminish the stereotypical bias in the representations while preserving the factual gender signal. Our filtering method shows that it is possible to decrease the bias of gender-neutral profession names without significant deterioration of language modeling capabilities. The findings can be applied to language generation to mitigate reliance on stereotypes while preserving gender agreement in coreferences.
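
The idea of probing for two signals and filtering out only one of them can be illustrated with a small sketch. This is not the paper's implementation: it assumes plain least-squares linear probes and a single projection step, and every name in it (`fit_linear_probe`, `filter_bias`, `make_toy_data`, the synthetic "factual" and "bias" labels) is hypothetical and chosen for illustration only. Synthetic vectors stand in for real model embeddings.

```python
import numpy as np


def fit_linear_probe(embeddings, targets):
    """Least-squares direction w such that embeddings @ w approximates targets."""
    w, *_ = np.linalg.lstsq(embeddings, targets, rcond=None)
    return w


def filter_bias(embeddings, w_bias, w_factual):
    """Remove the bias direction from embeddings, keeping the factual probe's readout.

    The bias direction is first made orthogonal to the factual direction,
    so projecting it out cannot change embeddings @ w_factual.
    """
    f = w_factual / np.linalg.norm(w_factual)
    u = w_bias - (w_bias @ f) * f          # part of the bias direction orthogonal to f
    u = u / np.linalg.norm(u)
    return embeddings - np.outer(embeddings @ u, u)


def make_toy_data(n=200, dim=8, seed=0):
    """Synthetic stand-in for embeddings (assumption: not real model data)."""
    rng = np.random.default_rng(seed)
    factual = rng.choice([-1.0, 1.0], size=n)   # e.g. grammatical gender label
    bias = rng.uniform(-1.0, 1.0, size=n)       # e.g. stereotype score
    x = 0.1 * rng.normal(size=(n, dim))
    x[:, 0] += factual                           # factual signal lives in dimension 0
    x[:, 1] += bias                              # bias signal lives in dimension 1
    return x, factual, bias


if __name__ == "__main__":
    x, factual, bias = make_toy_data()
    w_factual = fit_linear_probe(x, factual)
    w_bias = fit_linear_probe(x, bias)
    x_filtered = filter_bias(x, w_bias, w_factual)
    # The factual probe reads the same values before and after filtering.
    print(np.allclose(x_filtered @ w_factual, x @ w_factual))
```

The design choice worth noting is the orthogonalization step: removing the raw bias direction could also erase part of the factual signal when the two directions overlap, whereas removing only the component orthogonal to the factual direction leaves the factual probe's output unchanged by construction.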

