CL, Oct 19, 2023

Co$^2$PT: Mitigating Bias in Pre-trained Language Models through Counterfactual Contrastive Prompt Tuning

Amazon, Microsoft
arXiv:2310.12490v1, 20 citations, h-index: 53
Originality: Incremental advance
AI Analysis

The paper addresses bias mitigation in language models for downstream applications, but the contribution appears incremental because it builds on existing debiasing and prompt-tuning techniques.

The paper tackles the problem of social biases in pre-trained language models by proposing Co$^2$PT, a method for debiasing during prompt tuning, which shows effectiveness on three bias benchmarks.

Pre-trained Language Models are widely used in many important real-world applications. However, recent studies show that these models can encode social biases from large pre-training corpora and even amplify biases in downstream applications. To address this challenge, we propose Co$^2$PT, an efficient and effective debias-while-prompt tuning method for mitigating biases via counterfactual contrastive prompt tuning on downstream tasks. Our experiments conducted on three extrinsic bias benchmarks demonstrate the effectiveness of Co$^2$PT on bias mitigation during the prompt tuning process and its adaptability to existing upstream debiased language models. These findings indicate the strength of Co$^2$PT and provide promising avenues for further enhancement in bias mitigation on downstream tasks.
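
As a rough illustration of the two ingredients the abstract names, the sketch below builds a counterfactual sentence by swapping demographic terms and scores original/counterfactual embedding pairs with an InfoNCE-style contrastive loss. This is not the paper's implementation: the word list, the function names `counterfactual` and `contrastive_loss`, the choice of InfoNCE, and the temperature value are all assumptions made for illustration. In a prompt-tuning setup, a loss of this kind would presumably be combined with the downstream task loss while only the prompt parameters are updated.

```python
import math
import re

# Hypothetical, tiny term list; a real setup would use a much larger curated list.
SWAP_PAIRS = [("he", "she"), ("his", "her"), ("man", "woman"), ("father", "mother")]
SWAPS = {}
for a, b in SWAP_PAIRS:
    SWAPS[a] = b
    SWAPS[b] = a


def counterfactual(sentence):
    """Swap each listed term for its counterpart (exact lowercase matches only)."""
    def repl(match):
        word = match.group(0)
        return SWAPS.get(word, word)
    return re.sub(r"[A-Za-z]+", repl, sentence)


def cosine(u, v):
    """Cosine similarity between two non-zero vectors given as lists of floats."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)


def contrastive_loss(originals, counterfactuals, temperature=0.05):
    """InfoNCE-style loss (an assumed choice, not confirmed by the source).

    Each original embedding is pulled toward the embedding of its own
    counterfactual; the other counterfactuals in the batch act as negatives.
    """
    total = 0.0
    for i, anchor in enumerate(originals):
        logits = [cosine(anchor, c) / temperature for c in counterfactuals]
        top = max(logits)
        # Numerically stable log-sum-exp over the batch.
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denom - logits[i]
    return total / len(originals)


if __name__ == "__main__":
    print(counterfactual("the doctor said he lost his keys"))
    toy_originals = [[1.0, 0.0], [0.0, 1.0]]
    toy_counterfactuals = [[1.0, 0.0], [0.0, 1.0]]
    print(contrastive_loss(toy_originals, toy_counterfactuals))
```

The loss is small when each sentence and its counterfactual receive near-identical representations, which is the property a debiasing objective of this kind is meant to encourage.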

Code Implementations: 1 repo
