CL · Nov 18, 2025

Bias in, Bias out: Annotation Bias in Multilingual Large Language Models

arXiv:2511.14662v1 · 1 citation · Proceedings of the First Interdisciplinary Workshop on Observations of Misunderstood, Misguided and Malicious Use of Language Models
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of developing equitable multilingual LLMs for culturally diverse settings. Its contribution appears incremental, since it synthesizes and adapts existing ideas.

The paper tackles annotation bias in multilingual large language models, which distorts model outputs and causes social harms. It proposes a framework for understanding, detecting, and mitigating such bias, built from a typology, a synthesis of detection metrics, and an ensemble-based mitigation approach.

Annotation bias in NLP datasets remains a major challenge for developing multilingual Large Language Models (LLMs), particularly in culturally diverse settings. Bias from task framing, annotator subjectivity, and cultural mismatches can distort model outputs and exacerbate social harms. We propose a comprehensive framework for understanding annotation bias, distinguishing among instruction bias, annotator bias, and contextual and cultural bias. We review detection methods (including inter-annotator agreement, model disagreement, and metadata analysis) and highlight emerging techniques such as multilingual model divergence and cultural inference. We further outline proactive and reactive mitigation strategies, including diverse annotator recruitment, iterative guideline refinement, and post-hoc model adjustments. Our contributions include: (1) a typology of annotation bias; (2) a synthesis of detection metrics; (3) an ensemble-based bias mitigation approach adapted for multilingual settings; and (4) an ethical analysis of annotation processes. Together, these insights aim to inform more equitable and culturally grounded annotation pipelines for LLMs.
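Of the detection methods the abstract lists, inter-annotator agreement is the easiest to illustrate. The following is a minimal sketch, not the paper's method: it computes Cohen's kappa for two annotators separately per language, on the assumption that a language with markedly lower agreement may deserve a closer look at its guidelines or annotator pool. The function names, the language codes, and the toy labels are all hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and equally long")
    n = len(labels_a)
    # Fraction of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a, freq_b = Counter(labels_a), Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one identical label throughout
    return (observed - expected) / (1 - expected)


def kappa_by_language(annotations):
    """Map each language to kappa; annotations is {language: (labels_a, labels_b)}."""
    return {lang: cohen_kappa(a, b) for lang, (a, b) in annotations.items()}


# Hypothetical toy data: two annotators, four items per language.
toy = {
    "en": (["toxic", "ok", "ok", "toxic"], ["toxic", "ok", "ok", "toxic"]),
    "sw": (["toxic", "ok", "ok", "toxic"], ["ok", "ok", "toxic", "toxic"]),
}
scores = kappa_by_language(toy)
```

In this toy example the "en" pair agrees on every item while the "sw" pair agrees only at chance level, so the per-language scores diverge. Low agreement alone does not establish bias; it only marks where a manual review might start.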

