CL · Oct 16, 2020

It's not Greek to mBERT: Inducing Word-Level Translations from Multilingual BERT

arXiv:2010.08275v11003 citations
Originality: Incremental advance
AI Analysis

This work offers natural language processing researchers insight into cross-lingual transfer, though its contribution is incremental, as it analyzes an existing model.

The paper investigates the word-level translation capabilities embedded in multilingual BERT (mBERT) without fine-tuning. It finds that most translation information is encoded non-linearly, though some of it is recoverable with linear methods, and it identifies a language-identity subspace within mBERT's representations.

Recent works have demonstrated that multilingual BERT (mBERT) learns rich cross-lingual representations that allow for transfer across languages. We study the word-level translation information embedded in mBERT and present two simple methods that expose remarkable translation capabilities with no fine-tuning. The results suggest that most of this information is encoded in a non-linear way, while some of it can also be recovered with purely linear tools. As part of our analysis, we test the hypothesis that mBERT learns representations that contain both a language-encoding component and an abstract, cross-lingual component, and explicitly identify an empirical language-identity subspace within mBERT representations.
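
The abstract does not spell out its two methods, so the following is only a toy sketch of what a "purely linear tool" could look like under the stated hypothesis that a representation is an abstract cross-lingual component plus a language-encoding component. It uses synthetic NumPy vectors rather than real mBERT embeddings, and every name in it (`language_mean`, `translate_by_mean_shift`, the toy offsets) is an illustrative assumption, not the authors' code. The idea shown is a mean shift: subtract the source-language mean, add the target-language mean, and pick the nearest target-language vector.

```python
import numpy as np


def language_mean(embeddings: np.ndarray) -> np.ndarray:
    """Average vector of one language's words (a crude language component)."""
    return embeddings.mean(axis=0)


def translate_by_mean_shift(word_vec, src_mean, tgt_mean, tgt_embeddings) -> int:
    """Return the index of the target-language vector closest (by cosine)
    to the source vector after swapping the language means."""
    shifted = word_vec - src_mean + tgt_mean
    norms = np.linalg.norm(tgt_embeddings, axis=1) * np.linalg.norm(shifted)
    sims = tgt_embeddings @ shifted / np.maximum(norms, 1e-12)
    return int(np.argmax(sims))


# Synthetic data (assumption): each word = shared concept + language offset.
rng = np.random.default_rng(0)
dim, n_words = 16, 5
concepts = rng.normal(size=(n_words, dim))   # abstract, cross-lingual part
offset_src = 3.0 * rng.normal(size=dim)      # hypothetical source-language offset
offset_tgt = 3.0 * rng.normal(size=dim)      # hypothetical target-language offset
src = concepts + offset_src
tgt = concepts + offset_tgt

predictions = [
    translate_by_mean_shift(v, language_mean(src), language_mean(tgt), tgt)
    for v in src
]
print(predictions)
```

On this idealized data the shift recovers every pairing because the language component is exactly additive. Real mBERT representations are noisier, which is in line with the abstract's claim that only some translation information is linearly recoverable.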

Code Implementations: 1 repo