CL, CY · Jun 19, 2024

Leveraging Large Language Models to Measure Gender Representation Bias in Gendered Language Corpora

Erik Derner, Sara Sansalvador de la Fuente, Yoan Gutiérrez, Paloma Moreda, Nuria Oliver

arXiv:2406.13677v3 · 4.87 citations

Originality: Incremental advance

AI Analysis

This paper addresses gender bias in multilingual NLP by focusing on upstream imbalances in training data, an incremental but important step towards fairer AI systems.

The paper addresses gender representation bias in LLM training data for gendered languages. It proposes an LLM-based method to detect and quantify this bias, finds substantial male-dominant imbalances across four Spanish-English benchmarks and five Valencian corpora, and shows that the resulting bias in model outputs can be mitigated through small-scale training on datasets biased towards the opposite gender.

Large language models (LLMs) often inherit and amplify social biases embedded in their training data. A prominent social bias is gender bias. In this regard, prior work has mainly focused on gender stereotyping bias - the association of specific roles or traits with a particular gender - in English and on evaluating gender bias in model embeddings or generated outputs. In contrast, gender representation bias - the unequal frequency of references to individuals of different genders - in the training corpora has received less attention. Yet such imbalances in the training data constitute an upstream source of bias that can propagate and intensify throughout the entire model lifecycle. To fill this gap, we propose a novel LLM-based method to detect and quantify gender representation bias in LLM training data in gendered languages, where grammatical gender challenges the applicability of methods developed for English. By leveraging the LLMs' contextual understanding, our approach automatically identifies and classifies person-referencing words in gendered language corpora. Applied to four Spanish-English benchmarks and five Valencian corpora, our method reveals substantial male-dominant imbalances. We show that such biases in training data affect model outputs but can, surprisingly, be mitigated through small-scale training on datasets that are biased towards the opposite gender. Our findings highlight the need for corpus-level gender bias analysis in multilingual NLP. We make our code and data publicly available.
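
As a rough illustration of what "quantifying" representation bias can mean, here is a minimal sketch of the counting step only. It assumes an upstream classifier (the LLM, in the paper's approach) has already tagged each person-referencing word as male or female. The `"M"`/`"F"` tag scheme, the function name `gender_representation`, and the male-to-female ratio as the summary statistic are all assumptions made for this example, not details taken from the paper.

```python
from collections import Counter


def gender_representation(labels):
    """Summarize gender tags for person-referencing words.

    `labels` is an iterable of "M" / "F" tags, one per person-referencing
    word. The tag scheme is a hypothetical convention for this sketch.
    """
    counts = Counter(labels)
    male = counts.get("M", 0)
    female = counts.get("F", 0)
    # The ratio is undefined when no female references were found.
    ratio = male / female if female else None
    return {"male": male, "female": female, "male_to_female_ratio": ratio}


# Hypothetical tags for a short text with four person-referencing words.
summary = gender_representation(["M", "M", "F", "M"])
```

A ratio above 1 would indicate more male than female references in the tagged text. Other summary statistics are equally possible.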
