CL, AI, IR, LG | Feb 15, 2024

Self-Augmented In-Context Learning for Unsupervised Word Translation

Cambridge
arXiv:2402.10024v2 | 28 citations | h-index: 10 | ACL
AI Analysis

This paper addresses the challenge of improving LLM performance on unsupervised bilingual lexicon induction across a range of language pairs, proposing a new method for a known bottleneck.

The paper tackles unsupervised word translation without seed pairs, especially for lower-resource languages. It proposes self-augmented in-context learning (SAIL), which iteratively induces high-confidence translation pairs from an LLM and reapplies them to the same LLM as in-context examples. This yields substantial gains over zero-shot prompting and outperforms mapping-based baselines across benchmarks.

Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages. To address this challenge with LLMs, we propose self-augmented in-context learning (SAIL) for unsupervised BLI: starting from a zero-shot prompt, SAIL iteratively induces a set of high-confidence word translation pairs for in-context learning (ICL) from an LLM, which it then reapplies to the same LLM in the ICL fashion. Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks spanning a wide range of language pairs, also outperforming mapping-based baselines across the board. In addition to achieving state-of-the-art unsupervised BLI performance, we also conduct comprehensive analyses on SAIL and discuss its limitations.
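The loop described in the abstract (zero-shot prompt, induce high-confidence pairs, reuse them as in-context examples) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt template, the function names (`build_prompt`, `translate`, `sail`), the parameters `n_iters` and `max_examples`, and the stand-in `toy_llm` are all hypothetical. In particular, the abstract does not say how "high confidence" is decided; the round-trip (back-translation) check used here is an assumption made purely for the sketch.

```python
from typing import Callable, Dict, List, Tuple

Pair = Tuple[str, str]
LLM = Callable[[str], str]  # hypothetical interface: prompt in, completion out


def build_prompt(word: str, src: str, tgt: str, examples: List[Pair]) -> str:
    """Zero-shot prompt when `examples` is empty, few-shot (ICL) otherwise.

    The template is an assumption for illustration only.
    """
    lines = [f"The {src} word '{s}' in {tgt} is: {t}" for s, t in examples]
    lines.append(f"The {src} word '{word}' in {tgt} is:")
    return "\n".join(lines)


def translate(llm: LLM, word: str, src: str, tgt: str, examples: List[Pair]) -> str:
    return llm(build_prompt(word, src, tgt, examples)).strip()


def sail(
    llm: LLM,
    src_words: List[str],
    src: str,
    tgt: str,
    n_iters: int = 2,
    max_examples: int = 5,
) -> Tuple[List[Pair], Dict[str, str]]:
    """Self-augmentation loop: start zero-shot, then reuse induced pairs as ICL examples."""
    examples: List[Pair] = []  # empty on the first pass, i.e. zero-shot prompting
    for _ in range(n_iters):
        confident: List[Pair] = []
        for w in src_words:
            t = translate(llm, w, src, tgt, examples)
            # ASSUMPTION: keep a pair only if translating back recovers the source word.
            reverse_examples = [(b, a) for a, b in examples]
            back = translate(llm, t, tgt, src, reverse_examples)
            if t and back == w:
                confident.append((w, t))
        examples = confident[:max_examples]
    # Final pass: translate every word with the induced in-context examples.
    final = {w: translate(llm, w, src, tgt, examples) for w in src_words}
    return examples, final


def toy_llm(prompt: str) -> str:
    """Stand-in for a real LLM: looks up the quoted word on the last prompt line."""
    word = prompt.splitlines()[-1].split("'")[1]
    table = {
        "chien": "dog", "dog": "chien",
        "chat": "cat", "cat": "chat",
        "pain": "bread", "bread": "miche",  # round trip fails for "pain"
    }
    return table.get(word, "")
```

With the toy model, `sail(toy_llm, ["chien", "chat", "pain"], "French", "English")` keeps `("chien", "dog")` and `("chat", "cat")` as in-context examples and drops the pair for "pain", since its back-translation does not return the original word. A real LLM would replace `toy_llm`, and the filtering criterion should be taken from the paper rather than from this sketch.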

Code Implementations: 1 repo