Gender-tuning: Empowering Fine-tuning for Debiasing Pre-trained Language Models
This work addresses bias propagation in language models used in NLP applications, offering a deployable solution that builds incrementally on existing fine-tuning methods.
The paper tackles societal biases in pre-trained language models by proposing Gender-tuning, a fine-tuning method that integrates masked language modeling objectives into fine-tuning and debiases models using only the downstream tasks' datasets, improving both gender bias scores and downstream task performance.
Recent studies have revealed that widely used Pre-trained Language Models (PLMs) propagate societal biases from large, unmoderated pre-training corpora. Existing solutions require separate debiasing training processes and dedicated debiasing datasets, which are resource-intensive and costly. Furthermore, these methods hurt the PLMs' performance on downstream tasks. In this study, we propose Gender-tuning, which debiases PLMs through fine-tuning on downstream tasks' datasets. To this end, Gender-tuning integrates Masked Language Modeling (MLM) training objectives into the fine-tuning training process. Comprehensive experiments show that Gender-tuning outperforms state-of-the-art baselines in terms of average gender bias scores in PLMs while improving the PLMs' performance on downstream tasks, using only the downstream tasks' datasets. Gender-tuning is also a deployable debiasing tool for any PLM that works with standard fine-tuning.
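As a rough illustration of what integrating an MLM objective into fine-tuning could look like, the sketch below masks gender-related tokens in a downstream example and combines an MLM loss with a task loss. The abstract does not specify which tokens are masked or how the losses are combined, so the word list `GENDER_WORDS`, the `[MASK]` token, the weighted sum, and the weight `alpha` are all assumptions made for illustration, not the paper's stated procedure.

```python
import math

# Hypothetical, deliberately tiny word list (an assumption, not from the paper).
GENDER_WORDS = {"he", "she", "him", "her", "his", "hers", "man", "woman"}
MASK_TOKEN = "[MASK]"  # assumed mask symbol, as used by BERT-style tokenizers


def mask_gender_words(tokens):
    """Replace gender-related tokens with MASK_TOKEN.

    Returns the masked token list and a dict mapping each masked
    position to the original token (the MLM prediction target).
    """
    masked, targets = [], {}
    for i, tok in enumerate(tokens):
        if tok.lower() in GENDER_WORDS:
            masked.append(MASK_TOKEN)
            targets[i] = tok
        else:
            masked.append(tok)
    return masked, targets


def cross_entropy(probs, target_index):
    """Negative log-likelihood of the target under a probability vector."""
    return -math.log(probs[target_index])


def joint_loss(mlm_loss, task_loss, alpha=0.5):
    """Assumed combination: a weighted sum of the MLM and task losses."""
    return alpha * mlm_loss + (1.0 - alpha) * task_loss


# Usage: mask one downstream example, then combine two illustrative losses.
masked, targets = mask_gender_words("She said his review was great".split())
loss = joint_loss(mlm_loss=cross_entropy([0.25, 0.75], 1),
                  task_loss=cross_entropy([0.9, 0.1], 0))
```

In a real setup the probability vectors would come from the PLM's MLM head and classification head on the same downstream batch; here they are placeholder numbers so the sketch runs without any model.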