CL | Apr 13, 2022

The Impact of Cross-Lingual Adjustment of Contextual Word Representations on Zero-Shot Transfer

Amazon
arXiv:2204.06457v2 | 11 citations | h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses zero-shot transfer challenges for multilingual NLP applications, but it is incremental as it builds on prior methods with new languages and tasks.

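The study extends the cross-lingual adjustment method of Cao et al. (2020) to a more diverse set of languages and tasks. It reports improved NLI, NER, XSR, and cross-lingual QA performance for some languages, while monolingual QA never improved and sometimes degraded. Because fine-tuning was found to "forget" part of the cross-lingual alignment, the authors applied continual learning, which further improved NLI performance.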

Large multilingual language models such as mBERT or XLM-R enable zero-shot cross-lingual transfer in various IR and NLP tasks. Cao et al. (2020) proposed a data- and compute-efficient method for cross-lingual adjustment of mBERT that uses a small parallel corpus to make embeddings of related words across languages similar to each other. They showed it to be effective in NLI for five European languages. In contrast, we experiment with a typologically diverse set of languages (Spanish, Russian, Vietnamese, and Hindi) and extend their original implementations to new tasks (XSR, NER, and QA) and an additional training regime (continual learning). Our study reproduced gains in NLI for four languages and showed improved NER, XSR, and cross-lingual QA results in three languages (though some cross-lingual QA gains were not statistically significant), while monolingual QA performance never improved and sometimes degraded. Analysis of distances between contextualized embeddings of related and unrelated words (across languages) showed that fine-tuning leads to "forgetting" some of the cross-lingual alignment information. Based on this observation, we further improved NLI performance using continual learning.
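The two ideas in the abstract (pulling embeddings of related words together, then comparing distances for related and unrelated word pairs) can be illustrated with a minimal sketch. This is not the authors' implementation. The function names, the toy vectors, and the regularization term that keeps source-side embeddings near their pre-adjustment values are all assumptions made for illustration, and plain Python lists stand in for contextualized embeddings that would come from a model such as mBERT.

```python
import math


def l2_sq(u, v):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v))


def alignment_loss(src_vecs, tgt_vecs, aligned_pairs, src_vecs_init, reg_weight=1.0):
    """Toy adjustment objective (assumed form, not taken from the paper's code).

    The alignment term pulls embeddings of aligned (related) word pairs together.
    The regularization term (an assumption) penalizes drift of the source-side
    embeddings away from their values before adjustment.
    """
    align = sum(l2_sq(src_vecs[i], tgt_vecs[j]) for i, j in aligned_pairs)
    reg = sum(l2_sq(v, v0) for v, v0 in zip(src_vecs, src_vecs_init))
    return align + reg_weight * reg


def mean_distance(src_vecs, tgt_vecs, pairs):
    """Average Euclidean distance over (source index, target index) pairs."""
    total = sum(math.sqrt(l2_sq(src_vecs[i], tgt_vecs[j])) for i, j in pairs)
    return total / len(pairs)


def separation_gap(src_vecs, tgt_vecs, related, unrelated):
    """Mean distance of unrelated pairs minus mean distance of related pairs.

    A larger gap means related words sit closer together than unrelated ones;
    a shrinking gap after fine-tuning would be one sign of "forgetting".
    """
    return mean_distance(src_vecs, tgt_vecs, unrelated) - mean_distance(
        src_vecs, tgt_vecs, related
    )


# Hypothetical 2-dimensional "embeddings" for two source and two target words.
src = [[1.0, 0.0], [0.0, 1.0]]
tgt = [[1.0, 0.0], [0.0, 2.0]]
related = [(0, 0), (1, 1)]    # word pairs treated as translations of each other
unrelated = [(0, 1), (1, 0)]  # mismatched word pairs

loss = alignment_loss(src, tgt, related, src_vecs_init=src)
gap = separation_gap(src, tgt, related, unrelated)
print(f"alignment loss: {loss:.3f}, separation gap: {gap:.3f}")
```

In a real setting the pairs would come from word alignments over the parallel corpus, and the gap would be measured before and after task fine-tuning to see how much alignment information is retained.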

Code Implementations: 1 repo