CL AI Aug 3, 2021

Q-Pain: A Question Answering Dataset to Measure Social Bias in Pain Management

Cécile Logé, Emily Ross, David Yaw Amoah Dadey, Saahil Jain, Adriel Saporta, Andrew Y. Ng, Pranav Rajpurkar

arXiv:2108.01764v1 3.0 29 citations

Originality: Incremental advance

AI Analysis

This work addresses the risk of AI bias in medical decision-making, particularly in pain management, but the advance is incremental because it builds on existing bias assessment methods.

The authors tackled the problem of social bias in medical question answering by introducing Q-Pain, a dataset for assessing bias in pain management, and found statistically significant differences in treatment recommendations between race-gender subgroups when testing GPT-2 and GPT-3.

Recent advances in Natural Language Processing (NLP), and specifically automated Question Answering (QA) systems, have demonstrated both impressive linguistic fluency and a pernicious tendency to reflect social biases. In this study, we introduce Q-Pain, a dataset for assessing bias in medical QA in the context of pain management, one of the most challenging forms of clinical decision-making. Along with the dataset, we propose a new, rigorous framework, including a sample experimental design, to measure the potential biases present when making treatment decisions. We demonstrate its use by assessing two reference Question-Answering systems, GPT-2 and GPT-3, and find statistically significant differences in treatment between intersectional race-gender subgroups, thus reaffirming the risks posed by AI in medical settings, and the need for datasets like ours to ensure safety before medical AI applications are deployed.
