CL, Apr 15, 2021

Toward Deconfounding the Influence of Entity Demographics for Question Answering Accuracy

arXiv:2104.07571v2, 11 citations
AI Analysis

This work addresses hidden biases in QA datasets, a concern for both researchers and practitioners. It is incremental in that it builds on existing bias analysis.

The study examined whether skewed demographic distributions in major QA datasets affect model accuracy. It found little evidence that accuracy is lower by gender or nationality, but more variation by profession, and concluded that better-represented datasets are needed to uncover potential biases.

The goal of question answering (QA) is to answer any question. However, major QA datasets have skewed distributions over gender, profession, and nationality. Despite that skew, model accuracy analysis reveals little evidence that accuracy is lower for people based on gender or nationality; instead, there is more variation on professions (question topic). But QA's lack of representation could itself hide evidence of bias, necessitating QA datasets that better represent global diversity.

