CL | AI | Jan 8

From National Curricula to Cultural Awareness: Constructing Open-Ended Culture-Specific Question Answering Dataset

arXiv:2601.04632v1
Originality: Incremental advance
AI Analysis

This paper addresses cultural bias in LLMs, a problem relevant to both researchers and practitioners. The advance is incremental because it builds on existing curriculum-based methods.

The authors tackle the uneven cultural alignment of large language models by developing CuCu, a multi-agent LLM framework that automatically generates culture-specific question-answer pairs from national curricula. Applying CuCu to the Korean national social studies curriculum yields KCaQA, a dataset of 34.1k open-ended QA pairs.

Large language models (LLMs) achieve strong performance on many tasks, but their progress remains uneven across languages and cultures, often reflecting values latent in English-centric training data. To enable practical cultural alignment, we propose a scalable approach that leverages national social studies curricula as a foundation for culture-aware supervision. We introduce CuCu, an automated multi-agent LLM framework that transforms national textbook curricula into open-ended, culture-specific question-answer pairs. Applying CuCu to the Korean national social studies curriculum, we construct KCaQA, comprising 34.1k open-ended QA pairs. Our quantitative and qualitative analyses suggest that KCaQA covers culture-specific topics and produces responses grounded in local sociocultural contexts.

