CL · Oct 13, 2025

Culturally-Aware Conversations: A Framework & Benchmark for LLMs

Shreya Havaldar, Sunny Rai, Young-Min Cho, Lyle Ungar

arXiv:2510.11563v1 · 6 citations · h-index: 20 · Proceedings of the Fourth Workshop on Bridging Human-Computer Interaction and Natural Language Processing (HCI+NLP)

Originality: Incremental advance

AI Analysis

This work addresses the challenge of cultural misalignment in LLMs for users from diverse backgrounds, though the advance is incremental because it builds on existing evaluation methods.

The paper tackles the problem of evaluating LLMs in multicultural conversational settings by introducing a framework and benchmark grounded in sociocultural theory, and shows that current top LLMs struggle with cultural adaptation.

Existing benchmarks that measure cultural adaptation in LLMs are misaligned with the actual challenges these models face when interacting with users from diverse cultural backgrounds. In this work, we introduce the first framework and benchmark designed to evaluate LLMs in realistic, multicultural conversational settings. Grounded in sociocultural theory, our framework formalizes how linguistic style - a key element of cultural communication - is shaped by situational, relational, and cultural context. We construct a benchmark dataset based on this framework, annotated by culturally diverse raters, and propose a new set of desiderata for cross-cultural evaluation in NLP: conversational framing, stylistic sensitivity, and subjective correctness. We evaluate today's top LLMs on our benchmark and show that these models struggle with cultural adaptation in a conversational setting.
