CY · Apr 14

Can Persona-Prompted LLMs Emulate Subgroup Values? An Empirical Analysis of Generalisability and Fairness in Cultural Alignment

arXiv:2604.12851 · 89.0 · h-index: 11
Predicted impact: top 3% in CY · last 90 days · Originality: Incremental advance
AI Analysis

For AI alignment researchers, this paper reveals that fine-tuning for subgroup value alignment improves average performance but exacerbates fairness gaps across demographic groups.

LLMs emulate subgroup cultural values poorly: GPT-4.1 achieves only 57.4% accuracy in predicting the modal preferences of Singapore subgroups. Fine-tuning on structured numerical preferences improves accuracy on unseen subgroups by an average of 17.4%, but it widens performance disparities between subgroups.

Despite their global prevalence, many Large Language Models (LLMs) are aligned to a monolithic, often Western-centric set of values. This paper investigates the more challenging task of fine-grained value alignment: examining whether LLMs can emulate the distinct cultural values of demographic subgroups. Using Singapore as a case study and the World Values Survey (WVS), we examine the value landscape and show that even state-of-the-art models like GPT-4.1 achieve only 57.4% accuracy in predicting subgroup modal preferences. We construct a dataset of over 20,000 samples to train and evaluate a range of models. We demonstrate that simple fine-tuning on structured numerical preferences yields substantial gains, improving accuracy on unseen, out-of-distribution subgroups by an average of 17.4%. These gains partially transfer to open-ended generation. However, we find significant pre-existing performance biases, where models better emulate young, male, Chinese, and Christian personas. Furthermore, while fine-tuning improves average performance, it widens the disparity between subgroups when measured by distance-aware metrics. Our work offers insights into the limits and fairness implications of subgroup-level cultural alignment.
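The abstract contrasts plain accuracy on subgroup modal preferences with distance-aware metrics. The following is a minimal sketch of that contrast, not the authors' evaluation code. It assumes answers are integers on an ordinal scale, uses mean absolute error as a stand-in for the unspecified distance-aware metric, and runs on made-up toy data. The function names, subgroup labels, and question IDs are all hypothetical.

```python
from collections import Counter


def modal_response(responses):
    """Most common answer in a subgroup; ties go to the smallest option (an assumption)."""
    counts = Counter(responses)
    top = max(counts.values())
    return min(value for value, count in counts.items() if count == top)


def evaluate(predictions, survey):
    """Return (accuracy, mean absolute error) against each subgroup's modal answer.

    predictions: {(subgroup, question): predicted option}
    survey:      {(subgroup, question): list of respondent answers}
    """
    hits = 0
    total_distance = 0
    for key, predicted in predictions.items():
        mode = modal_response(survey[key])
        hits += predicted == mode                 # exact-match accuracy
        total_distance += abs(predicted - mode)   # distance on the ordinal scale
    n = len(predictions)
    return hits / n, total_distance / n


def evaluate_by_subgroup(predictions, survey):
    """Apply evaluate() separately to each subgroup to expose disparities."""
    subgroups = {key[0] for key in predictions}
    return {
        group: evaluate(
            {k: v for k, v in predictions.items() if k[0] == group}, survey
        )
        for group in subgroups
    }


# Toy data (invented for illustration, not taken from the World Values Survey).
survey = {
    ("young_male", "Q1"): [1, 1, 2, 1],
    ("young_male", "Q2"): [3, 4, 4, 4],
    ("older_female", "Q1"): [2, 2, 3, 2],
    ("older_female", "Q2"): [1, 1, 1, 2],
}
predictions = {
    ("young_male", "Q1"): 1,
    ("young_male", "Q2"): 4,
    ("older_female", "Q1"): 2,
    ("older_female", "Q2"): 4,  # far from the modal answer of 1
}

accuracy, mean_abs_error = evaluate(predictions, survey)
per_group = evaluate_by_subgroup(predictions, survey)
```

In this toy setup the single miss counts the same as any other miss under accuracy, while the distance-aware score also records how far the prediction landed from the modal answer. That is one way a gap between subgroups can look larger under a distance-aware metric than under accuracy alone.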

