CL | Oct 11, 2022

Are Pretrained Multilingual Models Equally Fair Across Languages?

arXiv:2210.05457v131.0586 citations | h-index: 46 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses fairness concerns in multilingual NLP applications, particularly for lower-resourced languages. It is an incremental advance: it extends existing fairness scrutiny from monolingual models to multilingual ones.

The study investigates whether pretrained multilingual language models are equally fair to demographic groups across languages. It finds that mBERT, XLM-R, and mT5 show different levels of group disparity depending on the language, for example near-equal risk for Spanish but high disparity for German.

Pretrained multilingual language models can help bridge the digital language divide, enabling high-quality NLP models for lower-resourced languages. Studies of multilingual models have so far focused on performance, consistency, and cross-lingual generalisation. However, with their widespread application in the wild and downstream societal impact, it is important to put multilingual models under the same scrutiny as monolingual models. This work investigates the group fairness of multilingual models, asking whether these models are equally fair across languages. To this end, we create a new four-way multilingual dataset of parallel cloze test examples (MozArt), equipped with demographic information (balanced with regard to gender and native tongue) about the test participants. We evaluate three multilingual models on MozArt (mBERT, XLM-R, and mT5) and show that across the four target languages, the three models exhibit different levels of group disparity, e.g., near-equal risk for Spanish but high levels of disparity for German.
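To make the notion of group disparity concrete, here is a minimal sketch of one way to measure it: compute the mean loss (risk) per demographic group within each language and report the largest gap between groups. This is an illustrative assumption, not necessarily the metric used in the paper. The function name `group_disparity`, the record layout, and the toy loss values are all hypothetical.

```python
from collections import defaultdict


def group_disparity(records):
    """Return {language: max gap in mean loss between demographic groups}.

    `records` is an iterable of (language, group, loss) tuples.
    This record layout is an assumption made for illustration only.
    """
    # Accumulate [total loss, example count] per (language, group) cell.
    sums = defaultdict(lambda: [0.0, 0])
    for language, group, loss in records:
        cell = sums[(language, group)]
        cell[0] += loss
        cell[1] += 1

    # Mean loss (empirical risk) per group, organised by language.
    risks = defaultdict(dict)
    for (language, group), (total, count) in sums.items():
        risks[language][group] = total / count

    # Disparity: gap between the worst-off and best-off group.
    return {
        language: max(by_group.values()) - min(by_group.values())
        for language, by_group in risks.items()
    }


# Invented toy losses, not results from the paper.
toy = [
    ("lang_a", "group_1", 0.5), ("lang_a", "group_2", 0.5),
    ("lang_b", "group_1", 0.25), ("lang_b", "group_2", 0.75),
]
gaps = group_disparity(toy)
```

A gap of zero means the groups face equal risk in that language, and a larger gap means higher disparity. Comparing the gaps across languages is one simple way to ask whether a model is equally fair in each of them.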

