CL, AI, LG | Dec 4, 2022

Cross-lingual Similarity of Multilingual Representations Revisited

arXiv:2212.01924v1297 citations | h-index: 22 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a methodological gap for researchers analyzing cross-lingual representations, though it is incremental as it builds on prior similarity measures.

The paper tackles the problem of measuring cross-lingual similarity in multilingual models. It argues that existing indexes such as CKA and CCA fail to capture aspects relevant to zero-shot transfer, and it introduces ANC as an alternative. Using ANC, it finds alignment patterns in both MLMs and CLMs, including their scaled-up versions.

Related works used indexes like CKA and variants of CCA to measure the similarity of cross-lingual representations in multilingual language models. In this paper, we argue that assumptions of CKA/CCA align poorly with one of the motivating goals of cross-lingual learning analysis, i.e., explaining zero-shot cross-lingual transfer. We highlight what valuable aspects of cross-lingual similarity these indexes fail to capture and provide a motivating case study demonstrating the problem empirically. Then, we introduce Average Neuron-Wise Correlation (ANC) as a straightforward alternative that is exempt from the difficulties of CKA/CCA and is good specifically in a cross-lingual context. Finally, we use ANC to construct evidence that the previously introduced "first align, then predict" pattern takes place not only in masked language models (MLMs) but also in multilingual models with causal language modeling objectives (CLMs). Moreover, we show that the pattern extends to the scaled versions of the MLMs and CLMs (up to 85x original mBERT). Our code is publicly available at https://github.com/TartuNLP/xsim
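The abstract names ANC without defining it. As a rough illustration only, the sketch below takes the name literally: for each neuron (representation dimension), compute the Pearson correlation between its activations on parallel sentences in two languages, then average over neurons. This reading, the function names, the treatment of constant neurons, and the toy data are all assumptions made here for illustration. The paper and its repository may define the index differently, for example in how correlations are signed or normalized.

```python
import math
from typing import Sequence


def pearson(a: Sequence[float], b: Sequence[float]) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0.0 or var_b == 0.0:
        # Assumption: a constant neuron contributes zero correlation.
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def average_neuron_wise_correlation(
    X: Sequence[Sequence[float]], Y: Sequence[Sequence[float]]
) -> float:
    """Mean per-neuron Pearson correlation (hypothetical reading of ANC).

    X and Y have shape (n_sentences, n_neurons). Row i of X and row i of Y
    are assumed to be representations of the same sentence in two languages.
    """
    if len(X) != len(Y) or len(X[0]) != len(Y[0]):
        raise ValueError("X and Y must have the same shape")
    n_neurons = len(X[0])
    total = 0.0
    for j in range(n_neurons):
        col_x = [row[j] for row in X]  # activations of neuron j, language A
        col_y = [row[j] for row in Y]  # activations of neuron j, language B
        total += pearson(col_x, col_y)
    return total / n_neurons


# Toy data: 3 parallel sentences, 2 neurons.
X = [[1.0, 2.0], [2.0, 0.0], [3.0, 1.0]]
Y_same = [row[:] for row in X]                # identical representations
Y_swapped = [[row[1], row[0]] for row in X]   # same values, neurons permuted

anc_same = average_neuron_wise_correlation(X, Y_same)        # 1.0
anc_swapped = average_neuron_wise_correlation(X, Y_swapped)  # -0.5
```

In the toy data, permuting the neurons of one language changes the score from 1.0 to -0.5 even though the two sets of representations contain the same information. Linear CKA, by contrast, is invariant to orthogonal transformations such as permutations, so a neuron-wise measure of this kind is sensitive to a property that CKA ignores by design.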

Code Implementations: 1 repo
