CL | AI | May 25, 2022

Perturbation Augmentation for Fairer NLP

Rebecca Qian, Candace Ross, Jude Fernandes, Eric Smith, Douwe Kiela, Adina Williams

Meta AI

arXiv:2205.12586v2 | 25.5322 citations | h-index: 23 | Has Code

Originality: Highly original

AI Analysis

This work is aimed at NLP researchers and practitioners concerned with fairness, and offers a method for reducing demographic bias in language models.

The paper tackles social biases in NLP by training language models on demographically perturbed data, finding that this typically leads to fairer models without sacrificing performance on downstream tasks.

Unwanted and often harmful social biases are becoming ever more salient in NLP research, affecting both models and datasets. In this work, we ask whether training on demographically perturbed data leads to fairer language models. We collect a large dataset of human-annotated text perturbations and train a neural perturbation model, which we show outperforms heuristic alternatives. We find that (i) language models (LMs) pre-trained on demographically perturbed corpora are typically more fair, (ii) LMs finetuned on perturbed GLUE datasets exhibit less demographic bias on downstream tasks, and (iii) fairness improvements do not come at the expense of performance on downstream tasks. Lastly, we discuss outstanding questions about how best to evaluate the (un)fairness of large language models. We hope that this exploration of neural demographic perturbation will help drive more improvement towards fairer NLP.
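To make the idea of demographic perturbation concrete, here is a minimal sketch of the kind of heuristic word-list baseline that the abstract contrasts with the neural perturbation model. It is not the authors' perturber: the swap table, the function names `perturb` and `augment`, and the choice to keep each original sentence alongside its perturbed copy are all assumptions made for illustration.

```python
import re

# Hypothetical swap table for illustration only; not taken from the paper.
# Ambiguous words such as "her" (which maps to "him" or "his" depending on
# grammatical role) are deliberately left out.
PAIRS = [
    ("he", "she"),
    ("man", "woman"),
    ("men", "women"),
    ("father", "mother"),
    ("son", "daughter"),
    ("brother", "sister"),
]
SWAPS = {}
for a, b in PAIRS:
    SWAPS[a] = b
    SWAPS[b] = a

# Match any listed word as a whole token, ignoring case.
_WORD = re.compile(
    r"\b(" + "|".join(re.escape(w) for w in SWAPS) + r")\b",
    re.IGNORECASE,
)


def _match_case(source, target):
    """Give `target` the same capitalisation pattern as `source`."""
    if len(source) > 1 and source.isupper():
        return target.upper()
    if source[0].isupper():
        return target.capitalize()
    return target


def perturb(text):
    """Swap every listed demographic term for its counterpart."""
    return _WORD.sub(
        lambda m: _match_case(m.group(0), SWAPS[m.group(0).lower()]),
        text,
    )


def augment(corpus):
    """Return the corpus plus a perturbed copy of each sentence that changed.

    Keeping the originals is an assumption of this sketch.
    """
    out = []
    for sentence in corpus:
        out.append(sentence)
        swapped = perturb(sentence)
        if swapped != sentence:
            out.append(swapped)
    return out
```

For example, `perturb("The woman said she would call the father.")` returns `"The man said he would call the mother."`. A fixed table like this cannot resolve context-dependent words such as "her", which is one limitation a learned perturbation model could in principle address.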
