CL · Oct 28, 2023

ProMap: Effective Bilingual Lexicon Induction via Language Model Prompting

arXiv:2310.18778v1124 citations · h-index: 20 · Has Code
Originality: Incremental advance
AI Analysis

This paper addresses word translation between languages, which particularly benefits low-resource language translation, but the advance is incremental because it builds on existing prompting and BLI methods.

The paper tackles Bilingual Lexicon Induction (BLI) by introducing ProMap, a method that uses prompting of pretrained multilingual language models, achieving state-of-the-art results on both rich-resource and low-resource languages and enabling strong performance with fewer than 10 training examples in few-shot scenarios.

Bilingual Lexicon Induction (BLI), where words are translated between two languages, is an important NLP task. While noticeable progress on BLI in rich-resource languages has been achieved using static word embeddings, word translation performance can be further improved by incorporating information from contextualized word embeddings. In this paper, we introduce ProMap, a novel approach for BLI that leverages the power of prompting pretrained multilingual and multidialectal language models to address these challenges. To handle the subword tokenization used by these models, ProMap relies on effective padded prompting of language models with a seed dictionary, which achieves good performance when used on its own. We also demonstrate the effectiveness of ProMap in re-ranking results from other BLI methods, such as those based on aligned static word embeddings. When evaluated on both rich-resource and low-resource languages, ProMap consistently achieves state-of-the-art results. Furthermore, ProMap enables strong performance in few-shot scenarios (even with fewer than 10 training examples), making it a valuable tool for low-resource language translation. Overall, we believe our method offers an exciting and promising direction for BLI in general and for low-resource languages in particular. ProMap code and data are available at https://github.com/4mekki4/promap.
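
The re-ranking idea mentioned in the abstract can be illustrated with a minimal sketch. This is not the ProMap implementation: the function names, the interpolation weight `alpha`, the shortlist size `k`, and the toy vectors and scores are all assumptions made for illustration. Here `prompt_scores` is a hypothetical stand-in for whatever score a prompted language model would assign to each candidate translation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rerank(source_vec, target_vecs, prompt_scores, k=3, alpha=0.5):
    """Shortlist k candidates by static-embedding similarity, then re-rank them.

    source_vec:    vector of the source word in a shared (aligned) space
    target_vecs:   dict mapping target word -> vector in the same space
    prompt_scores: dict mapping target word -> score from a second scorer
                   (hypothetical stand-in for a language-model prompt score)
    alpha:         assumed interpolation weight on the static similarity
    """
    static = {w: cosine(source_vec, v) for w, v in target_vecs.items()}
    shortlist = sorted(static, key=static.get, reverse=True)[:k]
    combined = {
        w: alpha * static[w] + (1 - alpha) * prompt_scores.get(w, 0.0)
        for w in shortlist
    }
    return sorted(shortlist, key=combined.get, reverse=True)


# Toy data (invented): English "cat" against three French candidates.
source_vec = [1.0, 0.0]
target_vecs = {
    "chien": [0.9, 0.1],
    "chat": [0.8, 0.2],
    "maison": [0.0, 1.0],
}
prompt_scores = {"chien": 0.2, "chat": 0.9, "maison": 0.1}

print(rerank(source_vec, target_vecs, prompt_scores, k=2))
```

In this toy setup the static embeddings alone rank "chien" first, and the second scorer moves "chat" to the top. Setting `alpha=1.0` recovers the static-embedding ranking unchanged.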

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
