CL · Feb 6, 2025

Sports and Women's Sports: Gender Bias in Text Generation with Olympic Data

arXiv:2502.04218v1 · 11 citations · h-index: 1 · NAACL
Originality: Synthesis-oriented
AI Analysis

This work addresses AI bias against marginalized groups, specifically women in sports, but it is incremental: it applies known bias detection methods to a new dataset.

The study examines gender bias in large language models using Olympic data from parallel men's and women's events. It finds that models are consistently biased against women when the prompt leaves gender ambiguous, often retrieving only the men's results without acknowledging them as such.

Large Language Models (LLMs) have been shown to be biased in prior work, as they generate text that is in line with stereotypical views of the world or that is not representative of the viewpoints and values of historically marginalized demographic groups. In this work, we propose using data from parallel men's and women's events at the Olympic Games to investigate different forms of gender bias in language models. We define three metrics to measure bias, and find that models are consistently biased against women when the gender is ambiguous in the prompt. In this case, the model frequently retrieves only the results of the men's event with or without acknowledging them as such, revealing pervasive gender bias in LLMs in the context of athletics.

