CL, AI, IR, LG · Feb 15, 2024

Self-Augmented In-Context Learning for Unsupervised Word Translation

Cambridge

arXiv:2402.10024v215.428 citations · h-index: 10 · Has Code · ACL

Originality: Highly original

AI Analysis

This work addresses the challenge of improving LLM performance on unsupervised bilingual lexicon induction across a range of language pairs, proposing a novel method for a known bottleneck.

The paper tackles unsupervised word translation without seed pairs, especially for lower-resource languages. It proposes self-augmented in-context learning (SAIL), which iteratively induces high-confidence translation pairs from an LLM and reapplies them to the same LLM as in-context examples. SAIL yields substantial gains over zero-shot prompting and outperforms mapping-based baselines on two established BLI benchmarks.

Recent work has shown that, while large language models (LLMs) demonstrate strong word translation or bilingual lexicon induction (BLI) capabilities in few-shot setups, they still cannot match the performance of 'traditional' mapping-based approaches in the unsupervised scenario where no seed translation pairs are available, especially for lower-resource languages. To address this challenge with LLMs, we propose self-augmented in-context learning (SAIL) for unsupervised BLI: starting from a zero-shot prompt, SAIL iteratively induces a set of high-confidence word translation pairs for in-context learning (ICL) from an LLM, which it then reapplies to the same LLM in the ICL fashion. Our method shows substantial gains over zero-shot prompting of LLMs on two established BLI benchmarks spanning a wide range of language pairs, also outperforming mapping-based baselines across the board. In addition to achieving state-of-the-art unsupervised BLI performance, we also conduct comprehensive analyses on SAIL and discuss its limitations.
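
The loop described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Translator` interface, the `sail` and `toy_translate` functions, and the toy vocabulary are all hypothetical, and the confidence test (keep a pair only if back-translation recovers the source word) is an assumption, since the abstract does not say how high-confidence pairs are selected. A dictionary-backed stand-in replaces the LLM so the sketch runs on its own.

```python
from typing import Callable, List, Tuple

Pair = Tuple[str, str]
# Hypothetical interface: (word, direction, in-context examples) -> translation.
# In a real setup this would wrap a prompt to an LLM; here it is a stand-in.
Translator = Callable[[str, str, List[Pair]], str]


def sail(source_words: List[str], translate: Translator,
         n_iterations: int = 2) -> List[Pair]:
    """Return the high-confidence pairs found after n_iterations rounds."""
    examples: List[Pair] = []  # an empty list corresponds to a zero-shot prompt
    for _ in range(n_iterations):
        reversed_examples = [(tgt, src) for src, tgt in examples]
        confident: List[Pair] = []
        for word in source_words:
            guess = translate(word, "src->tgt", examples)
            # ASSUMPTION: a pair counts as high-confidence when translating
            # the guess back yields the original source word.
            if translate(guess, "tgt->src", reversed_examples) == word:
                confident.append((word, guess))
        examples = confident  # reapply the induced pairs in the next round
    return examples


# Toy stand-in for an LLM (illustrative data only): without in-context
# examples it "knows" only the EASY words; with examples it knows all of GOLD.
GOLD = {"dog": "Hund", "cat": "Katze", "house": "Haus", "tree": "Baum"}
EASY = {"dog", "Hund", "cat", "Katze"}


def toy_translate(word: str, direction: str, examples: List[Pair]) -> str:
    if not examples and word not in EASY:
        return "?"
    table = GOLD if direction == "src->tgt" else {v: k for k, v in GOLD.items()}
    return table.get(word, "?")


words = ["dog", "cat", "house", "tree"]
round_one = sail(words, toy_translate, n_iterations=1)
round_two = sail(words, toy_translate, n_iterations=2)
print(round_one)
print(round_two)
```

In this toy run the first (zero-shot) round keeps only the pairs the stand-in model already translates consistently, and the second round, prompted with those pairs, recovers the rest. With a real LLM the pairs returned by `sail` would serve as the in-context examples for translating the test words.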
