CL | ML | Apr 15, 2021

Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models

arXiv:2104.07505v2 | 28 citations
Originality: Incremental advance
AI Analysis

This work addresses societal bias in AI for multilingual applications, but it is incremental as it extends existing bias quantification methods to a broader linguistic and political context.

The paper tackles the problem of quantifying gender bias towards politicians in cross-lingual language models by probing models in seven languages across six architectures. It finds that bias varies strongly by language and that larger models are not significantly more biased than smaller ones.

Recent research has demonstrated that large pre-trained language models reflect societal biases expressed in natural language. The present paper introduces a simple method for probing language models to conduct a multilingual study of gender bias towards politicians. We quantify the usage of adjectives and verbs generated by language models surrounding the names of politicians as a function of their gender. To this end, we curate a dataset of 250k politicians worldwide, including their names and gender. Our study is conducted in seven languages across six different language modeling architectures. The results demonstrate that pre-trained language models' stance towards politicians varies strongly across analyzed languages. We find that while some words such as dead and designated are associated with both male and female politicians, a few specific words such as beautiful and divorced are predominantly associated with female politicians. Finally, and contrary to previous findings, our study suggests that larger language models do not tend to be significantly more gender-biased than smaller ones.
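To make the idea of measuring word usage "as a function of gender" concrete, here is a minimal sketch of one way such an association score could be computed. It is not the paper's method: the function name `gender_log_odds`, the additive smoothing, and the log-odds statistic are all assumptions chosen for illustration, and the toy input is invented rather than taken from the paper's data or results. It assumes the words generated around each politician's name have already been collected and paired with that politician's gender.

```python
import math
from collections import Counter


def gender_log_odds(generations, smoothing=1.0):
    """Score each word by its smoothed log-odds of appearing with female vs. male names.

    `generations` is an iterable of (gender, words) pairs, where gender is
    "female" or "male" and words is a list of generated adjectives/verbs.
    Positive scores lean female, negative scores lean male.
    (Hypothetical helper; the statistic is an illustrative assumption.)
    """
    counts = {"female": Counter(), "male": Counter()}
    for gender, words in generations:
        counts[gender].update(words)

    vocab = set(counts["female"]) | set(counts["male"])
    # Additive smoothing keeps words unseen for one gender from producing log(0).
    totals = {
        g: sum(c.values()) + smoothing * len(vocab) for g, c in counts.items()
    }

    scores = {}
    for word in vocab:
        p_female = (counts["female"][word] + smoothing) / totals["female"]
        p_male = (counts["male"][word] + smoothing) / totals["male"]
        scores[word] = math.log(p_female / p_male)
    return scores


# Toy input, invented for illustration only (not results from the paper).
toy_generations = [
    ("female", ["beautiful", "designated"]),
    ("female", ["divorced", "dead"]),
    ("male", ["designated", "dead"]),
    ("male", ["dead", "designated"]),
]

scores = gender_log_odds(toy_generations)
for word, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{word:12s} {score:+.3f}")
```

In a real study the `generations` list would come from a language model prompted with politicians' names, and any statistic would need significance testing across languages and architectures before drawing conclusions.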

Code Implementations: 1 repo
