CL | Nov 17, 2025

How Good is BLI as an Alignment Measure: A Study in Word Embedding Paradigm

arXiv:2511.13040v1 | 1 citation | h-index: 8
Originality: Incremental advance
AI Analysis

This work addresses the evaluation of word embedding alignment for researchers in NLP, particularly for multilingual applications, though it is incremental in proposing refinements to existing metrics.

The study investigated whether Bilingual Lexicon Induction (BLI) accurately measures alignment between word embedding spaces. It found that BLI fails to capture true alignment in some cases and proposed stem-based BLI and vocabulary pruning to address this. The authors evaluated traditional alignment methods, multilingual models, and combined techniques across high- and low-resource languages, noting that combined techniques generally perform better except in low-resource cases, where multilingual embeddings excel.

Apart from a dwindling number of monolingual embedding studies, originating predominantly from low-resource domains, multilingual embeddings have evidently become the de facto choice: they adapt to code-mixed language use, they can process multilingual documents in a language-agnostic manner, and they remove the difficult task of aligning monolingual embeddings. But is this victory complete? Are multilingual models better than aligned monolingual models in every aspect? Can the higher computational cost of multilingual models always be justified? Or is there a compromise between the two extremes? Bilingual Lexicon Induction (BLI) is one of the most widely used metrics for evaluating the degree of alignment between two embedding spaces. In this study, we explore the strengths and limitations of BLI as a measure of that alignment. Further, we evaluate how well traditional embedding alignment techniques, novel multilingual models, and combined alignment techniques perform on BLI tasks for both high-resource and low-resource languages. We also investigate the impact of the language families to which the language pairs belong. We identify that BLI does not measure the true degree of alignment in some cases, and we propose solutions for them. We propose a novel stem-based BLI approach for evaluating two aligned embedding spaces that takes into account the inflected nature of languages, as opposed to the prevalent word-based BLI techniques. Further, we introduce a vocabulary pruning technique that is more informative in showing the degree of alignment, especially when performing BLI on multilingual embedding models. Combined embedding alignment techniques often perform better, while in certain cases (mainly low-resource languages) multilingual embeddings perform better.
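
To make the difference between word-based and stem-based BLI concrete, here is a minimal sketch, not the authors' implementation. It assumes BLI is scored as precision@1 under cosine nearest-neighbour retrieval, and that a stem-based variant counts a retrieved word as correct when its stem matches the stem of a gold translation. The function names, the toy suffix stripper `toy_stem`, and the toy vectors are all hypothetical and chosen only for illustration.

```python
import numpy as np


def normalize(matrix):
    # Scale each row to unit length so a dot product equals cosine similarity.
    return matrix / np.linalg.norm(matrix, axis=1, keepdims=True)


def bli_precision_at_1(src_vecs, tgt_vecs, tgt_words, gold, key=lambda w: w):
    """Precision@1 for BLI with cosine nearest-neighbour retrieval.

    gold maps a source row index to a set of acceptable target words.
    key maps a word to the form used for comparison (identity = word-based).
    """
    sims = normalize(src_vecs) @ normalize(tgt_vecs).T
    nearest = sims.argmax(axis=1)
    hits = 0
    for i, refs in gold.items():
        predicted = key(tgt_words[nearest[i]])
        if predicted in {key(r) for r in refs}:
            hits += 1
    return hits / len(gold)


def toy_stem(word):
    # Hypothetical suffix stripper standing in for a real stemmer (assumption).
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word


# Toy, already-aligned spaces (illustrative values only, not real embeddings).
tgt_words = ["walk", "walks", "house"]
tgt_vecs = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]])
src_vecs = np.array([[0.9, 0.12], [0.0, 1.0]])
gold = {0: {"walk"}, 1: {"house"}}

word_score = bli_precision_at_1(src_vecs, tgt_vecs, tgt_words, gold)
stem_score = bli_precision_at_1(src_vecs, tgt_vecs, tgt_words, gold, key=toy_stem)
print(word_score, stem_score)
```

In this toy setup the first source word retrieves the inflected form "walks" rather than the gold entry "walk", so word-based scoring counts a miss while stem-based scoring counts a hit. That is the kind of case in which word-based BLI may understate how well two spaces are aligned.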
