A Survey on Bias and Fairness in Natural Language Processing
This survey addresses the harm that biased NLP systems cause to users and society, but its contribution is incremental: it synthesizes existing research rather than introducing new methods.
The paper surveys the problem of biases in NLP models, analyzing their origins, fairness definitions, and mitigation methods across subfields, with a focus on how these models amplify stereotypes and impact social settings.
As NLP models become more integrated into people's everyday lives, it becomes important to examine the social effects of using these systems. While these models have improved language understanding and achieved higher accuracy on difficult downstream tasks, there is evidence that they amplify gender, racial, and cultural stereotypes and lead to a vicious cycle in many settings. In this survey, we analyze the origins of biases, the definitions of fairness, and how different subfields of NLP mitigate bias. Finally, we discuss how future studies can work towards eradicating pernicious biases from NLP algorithms.