Evaluating Gender Bias in Natural Language Inference
This work addresses ethical concerns in NLP by evaluating gender bias in NLI models. It is an incremental step toward better methods for detecting and evaluating such bias.
The authors tackle gender bias in natural language inference by proposing an evaluation methodology built on a challenge task that pairs gender-neutral premises with gender-specific hypotheses. They find that three state-of-the-art models (BERT, RoBERTa, BART) are significantly prone to gender-induced prediction errors, and that debiasing techniques such as dataset augmentation can reduce this bias in some cases.
Gender-bias stereotypes have recently raised significant ethical concerns in natural language processing. However, progress in detecting and evaluating gender bias in natural language understanding through inference is limited and requires further investigation. In this work, we propose an evaluation methodology to measure these biases by constructing a challenge task that pairs gender-neutral premises with a gender-specific hypothesis. We use our challenge task to test state-of-the-art NLI models for gender stereotypes associated with occupations. Our findings suggest that three models (BERT, RoBERTa, BART) trained on the MNLI and SNLI datasets are significantly prone to gender-induced prediction errors. We also find that debiasing techniques such as augmenting the training data so that it is gender-balanced can help reduce such bias in certain cases.
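The construction of the challenge task can be illustrated with a minimal Python sketch. It is not the authors' implementation: the templates, the occupation list, and the helper names `build_challenge_pairs` and `prediction_gap` are all hypothetical, and `predict` stands in for any NLI model that maps a premise and a hypothesis to a label. The idea is that the premise mentions only an occupation, so an unbiased model should give the same label to the female and male hypotheses.

```python
from itertools import product

# Hypothetical templates and word lists, chosen only for illustration.
PREMISE_TEMPLATE = "The {occupation} is preparing for work."
HYPOTHESIS_TEMPLATE = "This text speaks of a {gender} profession."
OCCUPATIONS = ["nurse", "engineer", "teacher"]
GENDERS = ["female", "male"]


def build_challenge_pairs(occupations, genders):
    """Pair each gender-neutral premise with each gender-specific hypothesis."""
    pairs = []
    for occupation, gender in product(occupations, genders):
        pairs.append({
            "occupation": occupation,
            "gender": gender,
            "premise": PREMISE_TEMPLATE.format(occupation=occupation),
            "hypothesis": HYPOTHESIS_TEMPLATE.format(gender=gender),
        })
    return pairs


def prediction_gap(pairs, predict):
    """Return the occupations whose predicted label changes with the gender.

    `predict` is an assumed callable: predict(premise, hypothesis) -> label.
    """
    labels_by_occupation = {}
    for pair in pairs:
        label = predict(pair["premise"], pair["hypothesis"])
        labels_by_occupation.setdefault(pair["occupation"], {})[pair["gender"]] = label
    # An occupation is flagged when the genders do not all receive the same label.
    return [
        occupation
        for occupation, labels in labels_by_occupation.items()
        if len(set(labels.values())) > 1
    ]
```

In use, `predict` would wrap a model fine-tuned on MNLI or SNLI. A flagged occupation indicates only that the label depends on the gender word in the hypothesis. It is a coarse signal and not the evaluation measure used in the paper.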