CL · Apr 2

Mitigating Cross-Lingual Cultural Inconsistencies in LLMs via Consensus-Driven Preference Optimisation

Lucas Resck, Isabelle Augenstein, Anna Korhonen

arXiv:2605.12515 · 90.2

Predicted impact: top 28% in CL · last 90 days

Originality: Incremental advance

AI Analysis

Addresses a critical failure in multilingual LLMs where prompt language overrides user persona, particularly affecting users of lower-resource languages.

Multilingual LLMs exhibit cross-lingual cultural inconsistency, e.g., answering literature queries differently based on prompt language despite a fixed persona. The proposed C-3PO method improves consistency by up to 0.10 in κ_S, with larger gains for lower-resource languages.

Despite their impressive capabilities, multilingual large language models (MLLMs) frequently exhibit inconsistent behaviour when the prompt's language changes. While such adaptation is generally desirable, it becomes a critical failure when a user's identity is explicitly defined. For instance, given a fixed British persona and an ambiguous everyday knowledge query about literature, the prompt's language frequently overwrites the system persona -- yielding Shakespeare in English but Cervantes in Spanish. To robustly quantify this Cross-lingual Cultural Inconsistency, we introduce Singleton Fleiss's $κ_S$, a metric mathematically resilient to hallucinations. For mitigation, we propose Cross-lingual Cultural Consistent Preference Optimisation (C-3PO), a consensus-driven alignment framework. C-3PO achieves up to a 0.10-point absolute increase in $κ_S$ over unaligned models, outperforming strong prompting and representation steering baselines. Empirical evaluations show this inconsistency disproportionately affects lower-resource languages like Indonesian and Persian. A layer-wise interpretability analysis reveals the underlying mechanism: by early-decoding intermediate layer representations, we find that MLLMs implicitly personalise outputs towards the prompt language's stereotypical culture as forward-pass representations stabilise.
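The abstract does not define Singleton Fleiss's $κ_S$, so the sketch below shows only the standard Fleiss's κ that it presumably builds on, treating each prompt language as a "rater" and each query as an "item". The function name `fleiss_kappa` and the toy answers are illustrative assumptions, not the authors' code or data, and the singleton modification is not reproduced here.

```python
from collections import Counter


def fleiss_kappa(answers_per_query):
    """Standard Fleiss's kappa (NOT the paper's singleton variant).

    answers_per_query: list of lists; each inner list holds the answer the
    model gave to one query in each prompt language (same languages per query).
    """
    n_items = len(answers_per_query)
    n_raters = len(answers_per_query[0])
    if n_raters < 2 or any(len(a) != n_raters for a in answers_per_query):
        raise ValueError("every query needs the same number (>= 2) of answers")

    category_totals = Counter()
    mean_agreement = 0.0
    for answers in answers_per_query:
        counts = Counter(answers)
        category_totals.update(counts)
        # Proportion of agreeing language pairs for this query.
        agreeing = sum(c * c for c in counts.values()) - n_raters
        mean_agreement += agreeing / (n_raters * (n_raters - 1))
    mean_agreement /= n_items

    # Agreement expected by chance from the overall answer distribution.
    total = n_items * n_raters
    chance = sum((c / total) ** 2 for c in category_totals.values())
    if chance == 1.0:
        return 1.0  # only one answer ever given: treat as perfect agreement
    return (mean_agreement - chance) / (1.0 - chance)


# Toy data (hypothetical): one persona, two queries, three prompt languages.
toy = [
    ["Shakespeare", "Shakespeare", "Shakespeare"],  # consistent
    ["Shakespeare", "Cervantes", "Cervantes"],      # language overrides persona
]
print(round(fleiss_kappa(toy), 2))
```

A value near 1 would indicate that the answer stays the same regardless of prompt language, while values near 0 indicate agreement no better than chance.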
