CL | Dec 23, 2023

Multilingual Bias Detection and Mitigation for Indian Languages

Ankita Maity, Anubhav Sharma, Rudra Dhar, Tushar Abhishek, Manish Gupta, Vasudeva Varma

arXiv:2312.15181v1 | 16.679 citations | WILDRE

Originality: Synthesis-oriented

AI Analysis

This work addresses a critical issue for millions of readers in India by providing the first solutions for bias detection and mitigation in Indian languages, though it is incremental as it applies existing methods to new data.

The paper tackles neutrality bias in Wikipedia content for Indian languages. It creates two large datasets covering 8 languages and evaluates multilingual Transformer models on bias detection and bias mitigation. According to this summary, the results show that these models can address both tasks effectively.

Lack of diverse perspectives causes neutrality bias in Wikipedia content, exposing millions of readers worldwide to potentially inaccurate information. Hence, neutrality bias detection and mitigation is a critical problem. Although previous studies have proposed effective solutions for English, no work exists for Indian languages. First, we contribute two large datasets, mWikiBias and mWNC, covering 8 languages, for the bias detection and mitigation tasks respectively. Next, we investigate the effectiveness of popular multilingual Transformer-based models for the two tasks by modeling detection as a binary classification problem and mitigation as a style transfer problem. We make the code and data publicly available.
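
The two task framings in the abstract can be illustrated with a minimal sketch of how training examples might be shaped. It assumes the data comes as parallel (biased, neutralized) sentence pairs, which the abstract does not state. The `SentencePair` record, both helper functions, and the sample sentence are hypothetical and are not taken from mWikiBias or mWNC.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class SentencePair:
    """Hypothetical record: a biased sentence and its neutralized rewrite."""
    lang: str
    biased: str
    neutral: str


def to_detection_examples(pairs: List[SentencePair]) -> List[Tuple[str, int]]:
    """Binary classification view: label 1 = biased, label 0 = neutral."""
    examples: List[Tuple[str, int]] = []
    for pair in pairs:
        examples.append((pair.biased, 1))
        examples.append((pair.neutral, 0))
    return examples


def to_mitigation_examples(pairs: List[SentencePair]) -> List[Tuple[str, str]]:
    """Style transfer view: source = biased text, target = neutral text."""
    return [(pair.biased, pair.neutral) for pair in pairs]


# Made-up sample for illustration only (assumption, not from the datasets).
pairs = [
    SentencePair(
        lang="en",
        biased="The brilliant scientist made a discovery.",
        neutral="The scientist made a discovery.",
    ),
]

detection = to_detection_examples(pairs)
mitigation = to_mitigation_examples(pairs)
```

Under this assumption, a classifier would consume `detection` as (text, label) pairs, while a sequence-to-sequence model would consume `mitigation` as (source, target) pairs.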
