CL | Apr 7, 2025

CARE: Multilingual Human Preference Learning for Cultural Awareness

arXiv:2504.05154v5 | 7 citations | h-index: 15 | Has Code | EMNLP
Originality: Incremental advance
AI Analysis

This paper addresses cultural bias in AI responses for multilingual users; it is rated incremental because it builds on existing preference learning methods.

The paper tackles the problem of language models lacking cultural awareness when tuned with generic human preferences. It shows that incorporating native cultural preferences improves performance across various models, and that models with stronger initial cultural performance benefit more from alignment.

Language Models (LMs) are typically tuned with human preferences to produce helpful responses, but the impact of preference tuning on the ability to handle culturally diverse queries remains understudied. In this paper, we systematically analyze how native human cultural preferences can be incorporated into the preference learning process to train more culturally aware LMs. We introduce CARE, a multilingual resource containing 3,490 culturally specific questions and 31.7k responses with human judgments. We demonstrate how a modest amount of high-quality native preferences improves cultural awareness across various LMs, outperforming larger generic preference data. Our analyses reveal that models with stronger initial cultural performance benefit more from alignment, leading to gaps among models developed in different regions with varying access to culturally relevant data. CARE is publicly available at https://github.com/Guochry/CARE.

Code Implementations: 1 repo
