CL · Dec 23, 2023

Multilingual Bias Detection and Mitigation for Indian Languages

arXiv:2312.15181v179 citations · WILDRE
Originality: Synthesis-oriented
AI Analysis

This work addresses a critical issue for millions of readers in India by providing the first solutions for bias detection and mitigation in Indian languages, though the contribution is incremental, since it applies existing methods to new data.

The paper tackles neutrality bias in Wikipedia content for Indian languages by creating two large datasets covering 8 languages and evaluating multilingual Transformer models on bias detection and mitigation. The results indicate that these models can address both tasks effectively.

Lack of diverse perspectives causes neutrality bias in Wikipedia content, exposing millions of readers worldwide to potentially inaccurate information. Hence, neutrality bias detection and mitigation is a critical problem. Although previous studies have proposed effective solutions for English, no work exists for Indian languages. First, we contribute two large datasets, mWikiBias and mWNC, covering 8 languages, for the bias detection and mitigation tasks respectively. Next, we investigate the effectiveness of popular multilingual Transformer-based models for the two tasks by modeling detection as a binary classification problem and mitigation as a style transfer problem. We make the code and data publicly available.

