CL · May 30, 2019

Reducing Gender Bias in Word-Level Language Models with a Gender-Equalizing Loss Function

arXiv:1905.12801v21121 citations
Originality: Incremental advance
AI Analysis

This paper addresses gender bias in language models for NLP applications, though the advance is incremental because it builds on prior debiasing strategies.

The authors tackle gender bias in word-level language models by adding a term to the loss function that equalizes the probabilities of male and female words. They report reduced bias without increased perplexity, and better results than existing methods in several respects, especially for occupation words.

Gender bias exists in natural language datasets, and neural language models tend to learn it, resulting in biased text generation. In this research, we propose a debiasing approach based on modifying the loss function. We introduce a new term to the loss function which attempts to equalize the probabilities of male and female words in the output. Using an array of bias evaluation metrics, we provide empirical evidence that our approach successfully mitigates gender bias in language models without increasing perplexity. Compared with the existing debiasing strategies of data augmentation and word embedding debiasing, our method performs better in several aspects, especially in reducing gender bias in occupation words. Finally, we introduce a combination of data augmentation and our approach, and show that it outperforms existing strategies in all bias evaluation metrics.
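The loss modification described in the abstract can be illustrated with a minimal sketch. The abstract only says that a new term tries to equalize the probabilities of male and female words; the specific penalty used here (the mean absolute log-ratio of probabilities over gendered word pairs), the weight `lam`, the function names, and the toy vocabulary are all assumptions made for illustration, not the authors' confirmed implementation.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def gender_equalizing_loss(logits, target, gender_pairs, lam=1.0):
    """Cross-entropy plus a penalty on unequal probabilities for gendered pairs.

    logits:       unnormalized scores over the vocabulary for one step
    target:       index of the correct next word
    gender_pairs: non-empty list of (female_index, male_index) pairs
                  (hypothetical input; the pair list is an assumption)
    lam:          weight of the penalty term (hypothetical name)
    """
    probs = softmax(logits)
    cross_entropy = -math.log(probs[target])
    # Assumed penalty: mean |log(P(female) / P(male))| over the pairs.
    # It is zero exactly when each pair receives equal probability.
    penalty = sum(
        abs(math.log(probs[f] / probs[m])) for f, m in gender_pairs
    ) / len(gender_pairs)
    return cross_entropy + lam * penalty, cross_entropy, penalty

# Toy vocabulary: ["the", "she", "he", "doctor"]; one pair (she, he).
toy_logits = [0.1, 0.5, 1.5, 0.2]
toy_pairs = [(1, 2)]
loss, ce, pen = gender_equalizing_loss(toy_logits, 3, toy_pairs, lam=0.5)
```

In this toy case the model scores "he" higher than "she", so the penalty is positive and raises the total loss; if the two words had equal scores the penalty would vanish and the loss would reduce to plain cross-entropy.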

Code Implementations: 2 repos
