CL Oct 6, 2022

Debiasing isn't enough! -- On the Effectiveness of Debiasing MLMs and their Social Biases in Downstream Tasks

arXiv:2210.02938v1598 citations, h-index: 32
Originality: Synthesis-oriented
AI Analysis

The findings highlight limitations in current bias evaluation methods and raise concerns about deploying MLMs in downstream applications. The work is an incremental analysis of existing issues.

The study found a weak correlation between task-agnostic and task-specific social bias evaluations in Masked Language Models (MLMs), and that debiased MLMs re-learn biases during fine-tuning due to biases in training data and labels.

We study the relationship between task-agnostic intrinsic and task-specific extrinsic social bias evaluation measures for Masked Language Models (MLMs), and find that there exists only a weak correlation between these two types of evaluation measures. Moreover, we find that MLMs debiased using different methods still re-learn social biases during fine-tuning on downstream tasks. We identify the social biases in both the training instances and their assigned labels as reasons for the discrepancy between intrinsic and extrinsic bias evaluation measurements. Overall, our findings highlight the limitations of existing MLM bias evaluation measures and raise concerns about the deployment of MLMs in downstream applications using those measures.
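As a rough illustration of what comparing the two kinds of measure could look like, the sketch below computes a correlation coefficient between one intrinsic and one extrinsic bias score per model. It assumes Pearson's r, which the abstract does not specify. The function name `pearson`, the variable names, and all score values are hypothetical placeholders, not results or code from the paper.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values invented for illustration only: one intrinsic and one
# extrinsic bias score per (hypothetical) debiased model. Not from the paper.
intrinsic_scores = [0.52, 0.61, 0.48, 0.57, 0.66]
extrinsic_scores = [0.30, 0.12, 0.25, 0.31, 0.18]

r = pearson(intrinsic_scores, extrinsic_scores)
print(f"Pearson r between intrinsic and extrinsic scores: {r:.3f}")
```

A value of r near zero would correspond to the weak relationship the abstract describes. A rank-based coefficient such as Spearman's could be substituted if only the ordering of models matters.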
