HC AI CL CY
Nov 26, 2025

TALES: A Taxonomy and Analysis of Cultural Representations in LLM-generated Stories

arXiv:2511.21322v2
4 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of assessing cultural representations in AI-generated content, which matters to both users and developers. The advance is incremental because it focuses on a specific domain and methodology.

The authors evaluate cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. They find that 88% of generated stories contain misrepresentations, with higher prevalence in mid- and low-resourced languages and in stories set in peri-urban regions.

Millions of users across the globe turn to AI chatbots for their creative needs, inviting widespread interest in understanding how they represent diverse cultures. However, evaluating cultural representations in open-ended tasks remains challenging and underexplored. In this work, we present TALES, an evaluation of cultural misrepresentations in LLM-generated stories for diverse Indian cultural identities. First, we develop TALES-Tax, a taxonomy of cultural misrepresentations by collating insights from participants with lived experiences in India through focus groups (N=9) and individual surveys (N=15). Using TALES-Tax, we evaluate 6 models through a large-scale annotation study spanning 2925 annotations from 108 annotators with lived experience and native language proficiency from across 71 regions in India and 14 languages. Concerningly, we find that 88% of the generated stories contain misrepresentations, and such errors are more prevalent in mid- and low-resourced languages and stories based in peri-urban regions in India. We also transform the annotations into TALES-QA, a standalone question bank to evaluate the cultural knowledge of models.
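
As a rough illustration of how a headline figure such as "88% of stories contain misrepresentations" could be derived from per-annotation labels, the sketch below aggregates annotations to the story level and then breaks prevalence down by language resource tier. Everything here is an assumption for illustration: the record layout, the tier names, the toy data, and the rule that a story counts as misrepresented if any annotator flags it. The paper's actual aggregation procedure may differ.

```python
from collections import defaultdict

# Hypothetical toy records, not data from the paper:
# (story_id, language_tier, annotator_flagged_a_misrepresentation)
annotations = [
    ("s1", "high", False),
    ("s1", "high", True),
    ("s2", "mid", True),
    ("s3", "low", True),
    ("s4", "high", False),
]


def story_level_flags(records):
    """Collapse annotations to one flag per story.

    Assumed rule: a story is misrepresented if ANY annotation flags it.
    """
    flags = {}
    tiers = {}
    for story_id, tier, flagged in records:
        flags[story_id] = flags.get(story_id, False) or flagged
        tiers[story_id] = tier
    return flags, tiers


def overall_prevalence(records):
    """Fraction of stories with at least one flagged misrepresentation."""
    flags, _ = story_level_flags(records)
    return sum(flags.values()) / len(flags)


def prevalence_by_tier(records):
    """Story-level prevalence within each language resource tier."""
    flags, tiers = story_level_flags(records)
    totals = defaultdict(int)
    hits = defaultdict(int)
    for story_id, flagged in flags.items():
        totals[tiers[story_id]] += 1
        hits[tiers[story_id]] += int(flagged)
    return {tier: hits[tier] / totals[tier] for tier in totals}


if __name__ == "__main__":
    print(overall_prevalence(annotations))
    print(prevalence_by_tier(annotations))
```

Aggregating to the story level first matters because one story can receive several annotations; counting flagged annotations directly would weight heavily annotated stories more than others.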

