CL · AI · Sep 2, 2025

Clustering Discourses: Racial Biases in Short Stories about Women Generated by Large Language Models

arXiv:2509.02834v1 · 1 citation · h-index: 8 · STIL
Originality: Synthesis-oriented
AI Analysis

This paper addresses biases in AI-generated content about marginalized groups, though its contribution is incremental: it applies existing methods to new data.

The study investigates racial biases in short stories about Black and white women generated in Portuguese by LLaMA 3.2-3B. From 2100 texts, it identifies three discursive representations (social overcoming, ancestral mythification, and subjective self-realization) and finds that grammatically coherent, seemingly neutral texts reinforce historical inequalities through a colonially structured framing.

This study investigates how large language models, in particular LLaMA 3.2-3B, construct narratives about Black and white women in short stories generated in Portuguese. From 2100 texts, we applied computational methods to group semantically similar stories, allowing a selection for qualitative analysis. Three main discursive representations emerge: social overcoming, ancestral mythification, and subjective self-realization. The analysis uncovers how grammatically coherent, seemingly neutral texts materialize a crystallized, colonially structured framing of the female body, reinforcing historical inequalities. The study proposes an integrated approach that combines machine learning techniques with qualitative, manual discourse analysis.
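
The grouping step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the embedding model or the clustering algorithm, so everything here is an assumption: bag-of-words counts stand in for a semantic embedding, a greedy threshold rule stands in for the clustering method, and the function names (`vectorize`, `cosine`, `cluster`, `representative`) and the four toy English strings are hypothetical placeholders, not material from the paper's corpus.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts (assumed stand-in for a real sentence embedding)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster(texts, threshold=0.3):
    """Greedy single-pass grouping (an assumed method, not the paper's).

    Each text joins the cluster whose first member (its seed) is most
    similar to it, provided the similarity reaches `threshold`;
    otherwise it starts a new cluster.
    """
    vectors = [vectorize(t) for t in texts]
    clusters = []  # each cluster is a list of text indices, seed first
    for i, vec in enumerate(vectors):
        best, best_sim = None, threshold
        for members in clusters:
            sim = cosine(vec, vectors[members[0]])
            if sim >= best_sim:
                best, best_sim = members, sim
        if best is None:
            clusters.append([i])
        else:
            best.append(i)
    return clusters, vectors


def representative(members, vectors):
    """Pick the member with the highest mean similarity to the rest of its cluster."""
    def mean_sim(i):
        others = [j for j in members if j != i]
        if not others:
            return 1.0
        return sum(cosine(vectors[i], vectors[j]) for j in others) / len(others)
    return max(members, key=mean_sim)


# Toy placeholder strings, invented for illustration only.
stories = [
    "she worked hard and overcame poverty to finish school",
    "she overcame poverty and worked hard to open a shop",
    "her ancestors guided her through the ancient forest",
    "the ancient forest held the voices of her ancestors",
]

clusters, vectors = cluster(stories)
# One story per cluster is then set aside for manual, qualitative reading.
selected = [representative(members, vectors) for members in clusters]
```

In a setup like this, the representatives in `selected` are the stories a human analyst would read closely, which mirrors the abstract's description of clustering as a way to select texts for qualitative discourse analysis rather than as an end in itself.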

