CL, CY, LG, SI | Oct 18, 2022

The Tail Wagging the Dog: Dataset Construction Biases of Social Bias Benchmarks

arXiv:2210.10040v2 | 238 citations | h-index: 64
Originality: Incremental advance
AI Analysis

This work highlights a critical flaw in current social bias evaluation methods, potentially affecting researchers and practitioners relying on these benchmarks for fairness assessments.

The study tests the reliability of social bias benchmarks by showing that innocuous dataset modifications, such as paraphrasing, can substantially alter the bias scores measured for a model, with changes of up to 30% observed on Winogender and BiasNLI.

How reliably can we trust the scores obtained from social bias benchmarks as faithful indicators of problematic social biases in a given language model? In this work, we study this question by contrasting social biases with non-social biases stemming from choices made during dataset construction that might not even be discernible to the human eye. To do so, we empirically simulate various alternative constructions for a given benchmark based on innocuous modifications (such as paraphrasing or random-sampling) that maintain the essence of their social bias. On two well-known social bias benchmarks (Winogender and BiasNLI) we observe that these shallow modifications have a surprising effect on the resulting degree of bias across various models. We hope these troubling observations motivate more robust measures of social biases.
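The random-sampling idea in the abstract can be illustrated with a small sketch. This is not the authors' code or metric: the benchmark items are synthetic, the `stereotypical` flag stands in for a model's per-item outcome, and the function names (`bias_score`, `resample_constructions`, `score_spread`) are hypothetical. It only shows how one might compare a bias score across several alternative constructions of the same benchmark.

```python
import random
import statistics


def bias_score(items):
    """Toy bias score (an assumption, not the paper's metric): the fraction
    of items on which a hypothetical model picks the stereotypical reading."""
    return sum(item["stereotypical"] for item in items) / len(items)


def resample_constructions(items, k, n_variants, seed=0):
    """Simulate alternative benchmark constructions by drawing random
    subsets of size k (without replacement) from the full item pool."""
    rng = random.Random(seed)
    return [rng.sample(items, k) for _ in range(n_variants)]


def score_spread(items, k, n_variants, seed=0):
    """Score every resampled construction and summarize the variation."""
    variants = resample_constructions(items, k, n_variants, seed)
    scores = [bias_score(variant) for variant in variants]
    return {
        "min": min(scores),
        "max": max(scores),
        "mean": statistics.mean(scores),
        "range": max(scores) - min(scores),
    }


# Synthetic stand-in data: every third item is marked stereotypical.
benchmark = [{"id": i, "stereotypical": i % 3 == 0} for i in range(120)]

spread = score_spread(benchmark, k=60, n_variants=20)
print(f"full-benchmark score: {bias_score(benchmark):.3f}")
print(f"score range across resampled constructions: {spread['range']:.3f}")
```

A wide range across constructions, relative to the differences between models, would suggest that the score partly reflects how the benchmark was built rather than the model's social bias alone. Paraphrase-based variants would need a paraphrasing step in place of `rng.sample`.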

Code Implementations: 1 repo