CY, AI | Apr 11, 2025

An Evaluation of Cultural Value Alignment in LLM

Nicholas Sukiennik, Chen Gao, Fengli Xu, Yong Li

arXiv:2504.08863v1 | 15.837 citations | h-index: 34

Originality: Incremental advance

AI Analysis

This work addresses the problem of cultural bias in LLMs for users and developers in global applications, though it is incremental as it builds on prior investigations of cultural representations.

The authors conducted the first large-scale evaluation of cultural value alignment across 20 countries and 10 LLMs, finding that outputs represent a moderate cultural middle ground, with the United States being the best-aligned country and GLM-4 having the best alignment ability.

LLMs as intelligent agents are increasingly applied in scenarios that involve human interaction, raising a critical concern about whether LLMs are faithful to the variations in culture across regions. Several works have investigated this question in various ways, finding biases in the cultural representations of LLM outputs. To gain a more comprehensive view, in this work we conduct the first large-scale evaluation of LLM culture, assessing 20 countries' cultures and languages across ten LLMs. Using a renowned cultural values questionnaire and carefully analyzing LLM output against human ground-truth scores, we thoroughly study LLMs' cultural alignment across countries and among individual models. Our findings show that the output over all models represents a moderate cultural middle ground. Given the overall skew, we propose an alignment metric, revealing that the United States is the best-aligned country and GLM-4 has the best ability to align to cultural values. Deeper investigation sheds light on the influence of model origin, prompt language, and value dimensions on cultural output. Specifically, models, regardless of where they originate, align better with the US than they do with China. The conclusions provide insight into how LLMs can be better aligned to various cultures, and provoke further discussion of the potential for LLMs to propagate cultural bias and the need for more culturally adaptable models.
