CL · Apr 7, 2022

Mapping the Multilingual Margins: Intersectional Biases of Sentiment Analysis Systems in English, Spanish, and Arabic

António Câmara, Nina Taneja, Tamjeed Azad, Emily Allaway, Richard Zemel

arXiv:2204.03558v132.1642 citations · h-index: 78

Originality: Incremental advance

AI Analysis

This paper addresses fairness issues in NLP for multilingual and intersectional contexts, though the advance is incremental because it builds on existing bias measurement work.

The paper tackles fairness in sentiment analysis systems by introducing multilingual Equity Evaluation Corpora and a statistical framework for measuring social biases. It finds that many systems exhibit statistically significant unisectional and intersectional biases across English, Spanish, and Arabic.

As natural language processing systems become more widespread, it is necessary to address fairness issues in their implementation and deployment to ensure that their negative impacts on society are understood and minimized. However, there is limited work that studies fairness using a multilingual and intersectional framework or on downstream tasks. In this paper, we introduce four multilingual Equity Evaluation Corpora, supplementary test sets designed to measure social biases, and a novel statistical framework for studying unisectional and intersectional social biases in natural language processing. We use these tools to measure gender, racial, ethnic, and intersectional social biases across five models trained on emotion regression tasks in English, Spanish, and Arabic. We find that many systems demonstrate statistically significant unisectional and intersectional social biases.
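To make the idea of a template-based bias test set concrete, here is a minimal sketch of how paired sentences that differ only in a group term can be scored and compared. It is not the authors' statistical framework: the templates, the group terms, the `toy_score` stand-in model, and the exact sign-flip permutation test are all assumptions chosen for illustration.

```python
from itertools import product
from statistics import mean

# Hypothetical templates and group terms, invented for this illustration.
TEMPLATES = [
    "{person} feels angry.",
    "{person} made me feel happy.",
    "The conversation with {person} was terrible.",
]
GROUP_A = ["she", "my sister", "this woman"]
GROUP_B = ["he", "my brother", "this man"]


def toy_score(sentence: str) -> float:
    """Stand-in for a trained emotion-regression model (assumption).

    Returns a fake intensity score and deliberately adds a small offset
    for GROUP_A terms so that the test below has something to detect.
    """
    text = sentence.lower()
    score = 0.5
    if "happy" in text:
        score += 0.3
    if "angry" in text or "terrible" in text:
        score -= 0.3
    if any(term in text for term in GROUP_A):
        score += 0.05  # injected bias, for demonstration only
    return score


def build_pairs(templates, group_a, group_b):
    """Fill each template with matched terms from the two groups."""
    return [
        (t.format(person=a), t.format(person=b))
        for t in templates
        for a, b in zip(group_a, group_b)
    ]


def paired_differences(pairs, score_fn):
    """Score difference (group A minus group B) for each sentence pair."""
    return [score_fn(a) - score_fn(b) for a, b in pairs]


def sign_flip_p_value(diffs):
    """Exact two-sided sign-flip permutation test on the mean difference.

    Enumerates all 2**n sign assignments, so it is only practical for a
    small number of pairs.
    """
    observed = abs(mean(diffs))
    hits = 0
    total = 0
    for signs in product((1, -1), repeat=len(diffs)):
        flipped = abs(mean(s * d for s, d in zip(signs, diffs)))
        hits += flipped >= observed - 1e-12  # tolerance for float noise
        total += 1
    return hits / total


pairs = build_pairs(TEMPLATES, GROUP_A, GROUP_B)
diffs = paired_differences(pairs, toy_score)
print(f"mean difference: {mean(diffs):.3f}")
print(f"p-value: {sign_flip_p_value(diffs):.4f}")
```

In practice `toy_score` would be replaced by a real model's prediction function, and an intersectional comparison would fill the templates with terms that vary along two attributes at once (for example gender and ethnicity) rather than one.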
