CL · Dec 23, 2021

Do Multi-Lingual Pre-trained Language Models Reveal Consistent Token Attributions in Different Languages?

arXiv:2112.12356v1
Originality: Incremental advance
AI Analysis

The paper addresses a fundamental gap in understanding for researchers and practitioners who use multi-lingual models, but the advance is incremental because it builds on existing evaluation frameworks.

The paper asks why multi-lingual pre-trained language models perform well by testing whether they assign consistent token attributions across languages. It finds that the models assign significantly different attributions to multi-lingual synonyms, and that attribution consistency correlates with downstream task performance.

During the past several years, a surge of multi-lingual Pre-trained Language Models (PLMs) has been proposed, achieving state-of-the-art performance in many cross-lingual downstream tasks. However, why multi-lingual PLMs perform well is still an open question. For example, it is unclear whether multi-lingual PLMs reveal consistent token attributions in different languages. To address this, we propose a Cross-lingual Consistency of Token Attributions (CCTA) evaluation framework. Extensive experiments on three downstream tasks demonstrate that multi-lingual PLMs assign significantly different attributions to multi-lingual synonyms. Moreover, we make the following observations: 1) Spanish achieves the most consistent token attributions across languages when it is used for training PLMs; 2) the consistency of token attributions strongly correlates with performance in downstream tasks.
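
To make the idea of "consistent token attributions" concrete, here is a minimal sketch of one possible consistency score. It is not the paper's CCTA framework, whose details are not given here. It assumes per-token attribution scores are already available for a parallel sentence pair, together with a word alignment between the two sentences. The function names, the use of Pearson correlation, the alignment format, and the toy numbers are all illustrative assumptions.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 items")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def attribution_consistency(src_attr, tgt_attr, alignment):
    """Correlate attribution scores over aligned (source, target) token pairs.

    src_attr, tgt_attr: per-token attribution scores for each sentence.
    alignment: list of (source_index, target_index) pairs.
    This scoring rule is an assumption for illustration, not the CCTA metric.
    """
    src = [src_attr[i] for i, _ in alignment]
    tgt = [tgt_attr[j] for _, j in alignment]
    return pearson(src, tgt)


# Made-up attribution scores for a hypothetical parallel sentence pair.
en_attr = [0.05, 0.60, 0.10, 0.25]
es_attr = [0.10, 0.50, 0.15, 0.25]
alignment = [(0, 0), (1, 1), (2, 2), (3, 3)]  # hypothetical one-to-one alignment

score = attribution_consistency(en_attr, es_attr, alignment)
print(f"consistency score: {score:.3f}")
```

Under this sketch, a score near 1 would mean that aligned tokens receive similarly ranked attributions in both languages, while a score near 0 or below would indicate the kind of inconsistency the abstract reports for multi-lingual synonyms.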

