Gender Bias Mitigation for Bangla Classification Tasks
This work addresses gender bias in low-resource Bangla language models, an incremental improvement for NLP applications in underrepresented languages.
The study tackles gender bias in Bangla pretrained language models by creating four manually annotated datasets for tasks such as sentiment analysis and by proposing a joint loss optimization technique to mitigate that bias. The results show effective bias reduction with accuracy that remains competitive with baseline methods.
In this study, we investigate gender bias in Bangla pretrained language models, a largely underexplored area in low-resource languages. To assess this bias, we applied gender-name swapping techniques to existing datasets, creating four manually annotated, task-specific datasets for sentiment analysis, toxicity detection, hate speech detection, and sarcasm detection. By altering names and gender-specific terms, we ensured that these datasets were suitable for detecting and mitigating gender bias. We then proposed a joint loss optimization technique to mitigate gender bias across task-specific pretrained models. We evaluated our approach against existing bias mitigation methods, and the results show that it effectively reduces bias while maintaining accuracy competitive with the baseline approaches. To promote further research, we have made both our implementation and datasets publicly available at https://github.com/sajib-kumar/Gender-Bias-Mitigation-From-Bangla-PLM
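The two ideas named above, gender-name swapping and a joint loss, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not give the swap lexicon or the form of the loss, so the word pairs (English placeholders standing in for Bangla names and gendered terms), the function names `swap_gender` and `joint_loss`, the L1 consistency penalty, and the weight `lam` are all hypothetical and are not taken from the released implementation.

```python
import math
import re

# Hypothetical swap lexicon. English placeholders stand in for the Bangla
# names and gender-specific terms, which the abstract does not list.
PAIRS = [("he", "she"), ("brother", "sister"), ("rahim", "karima")]
SWAP = {}
for male, female in PAIRS:
    SWAP[male] = female
    SWAP[female] = male


def swap_gender(text):
    """Return a counterfactual copy of text with listed gendered tokens swapped."""
    def replace(match):
        word = match.group(0)
        swapped = SWAP.get(word.lower())
        if swapped is None:
            return word  # not in the lexicon, keep unchanged
        # Preserve an initial capital letter, as in names at sentence start.
        return swapped.capitalize() if word[0].isupper() else swapped
    return re.sub(r"\w+", replace, text)


def joint_loss(p_orig, p_swap, label, lam=1.0):
    """Assumed joint objective: task loss plus a weighted bias penalty.

    p_orig, p_swap: class probabilities for the original and swapped input.
    label: index of the gold class.
    lam: hypothetical weight trading accuracy against bias reduction.
    """
    task = -math.log(p_orig[label])  # cross-entropy on the original input
    # L1 gap between the two predictions; zero when the model is
    # indifferent to the gender swap.
    bias = sum(abs(a - b) for a, b in zip(p_orig, p_swap))
    return task + lam * bias


original = "Rahim said he is happy"
counterfactual = swap_gender(original)
loss = joint_loss([0.8, 0.2], [0.6, 0.4], label=0, lam=0.5)
```

In this sketch the penalty vanishes when the model assigns the same probabilities to a sentence and its gender-swapped counterpart, so minimizing the sum pushes toward predictions that do not depend on the swapped terms while the cross-entropy term preserves task accuracy. Other choices for the penalty, such as a KL divergence between the two distributions, would fit the same pattern.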