CL May 11, 2018

Examining Gender and Race Bias in Two Hundred Sentiment Analysis Systems

arXiv:1805.04508v134.41248 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses bias in AI systems, a fairness and ethics concern. Its contribution is incremental: it builds on existing bias-examination work by introducing a new benchmark.

The authors address bias in sentiment analysis systems by creating the Equity Evaluation Corpus (EEC), 8,640 English sentences designed to test for gender and race bias. Of the 219 systems examined, several show statistically significant bias in their sentiment intensity predictions.

Automatic machine learning systems can inadvertently accentuate and perpetuate inappropriate human biases. Past work on examining inappropriate biases has largely focused on just individual systems. Further, there is no benchmark dataset for examining inappropriate biases in systems. Here for the first time, we present the Equity Evaluation Corpus (EEC), which consists of 8,640 English sentences carefully chosen to tease out biases towards certain races and genders. We use the dataset to examine 219 automatic sentiment analysis systems that took part in a recent shared task, SemEval-2018 Task 1 'Affect in Tweets'. We find that several of the systems show statistically significant bias; that is, they consistently provide slightly higher sentiment intensity predictions for one race or one gender. We make the EEC freely available.
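The kind of check the abstract describes can be sketched as follows: score sentence pairs that differ only in a gendered (or race-associated) term, then test whether the per-pair score differences are consistently on one side of zero. This is a minimal illustration under stated assumptions, not the authors' code. The templates, word pairs, and function names below are hypothetical and are not taken from the EEC, and the paired t statistic is used here as one common choice for comparing matched predictions.

```python
import math
import statistics

# Hypothetical templates and word pairs, for illustration only;
# the real EEC defines its own templates and word lists.
TEMPLATES = [
    "I made {person} feel angry.",
    "The conversation with {person} was great.",
]
WORD_PAIRS = [("my brother", "my sister"), ("this man", "this woman")]


def build_pairs(templates, word_pairs):
    """Return (sentence_a, sentence_b) tuples that differ only in the person term."""
    return [
        (t.format(person=a), t.format(person=b))
        for t in templates
        for a, b in word_pairs
    ]


def paired_t_statistic(scores_a, scores_b):
    """Return (mean difference, paired t statistic) for matched score lists."""
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    if len(diffs) < 2:
        raise ValueError("need at least two pairs")
    mean_diff = statistics.mean(diffs)
    sd = statistics.stdev(diffs)  # sample standard deviation of the differences
    if sd == 0:
        # All differences identical: the statistic is unbounded unless they are all zero.
        t = math.copysign(math.inf, mean_diff) if mean_diff != 0 else 0.0
        return mean_diff, t
    return mean_diff, mean_diff / (sd / math.sqrt(len(diffs)))


sentence_pairs = build_pairs(TEMPLATES, WORD_PAIRS)
```

To use it, pass each sentence of a pair to the system under test, collect the two score lists in matching order, and call `paired_t_statistic`. A large absolute t value with a nonzero mean difference suggests the system scores one group consistently higher; converting t to a p-value requires the t distribution with n - 1 degrees of freedom.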

