Racial Disparity in Natural Language Processing: A Case Study of Social Media African-American English
This paper addresses fairness in NLP for minority social groups, highlighting a specific case of algorithmic bias.
The study investigates racial disparity in natural language processing by analyzing language identification for tweets written in African-American English, motivated by the observation that current systems sometimes perform more poorly on language from females and minorities than on language from whites and males.
We highlight an important frontier in algorithmic fairness: disparity in the quality of natural language processing algorithms when applied to language from authors of different social groups. For example, current systems sometimes analyze the language of females and minorities more poorly than they do that of whites and males. We conduct an empirical analysis of racial disparity in language identification for tweets written in African-American English, and discuss implications of disparity in NLP.