CL May 11, 2023

Chain-of-Dictionary Prompting Elicits Translation in Large Language Models

arXiv:2305.06575v6 45 citations
Originality: Highly original
AI Analysis

This paper addresses a practical limitation of using LLMs for translation into low-resource languages, offering a novel method that improves performance where traditional in-context learning fails.

The paper tackles the problem of large language models struggling to translate rare words in low-resource languages. It proposes Chain-of-Dictionary (CoD) prompting, which augments the model's input with chains of multilingual dictionary entries for a subset of the input words to elicit translation abilities. The reported gains reach roughly a 13x increase in chrF++ (e.g., from 3.08 to 42.63 for English to Serbian written in Cyrillic script).

Large language models (LLMs) have shown surprisingly good performance in multilingual neural machine translation (MNMT) even when trained without parallel data. Yet, despite the fact that the amount of training data is gigantic, they still struggle with translating rare words, particularly for low-resource languages. Even worse, it is usually unrealistic to retrieve relevant demonstrations for in-context learning with low-resource languages on LLMs, which restricts the practical use of LLMs for translation -- how should we mitigate this problem? To this end, we present a novel method, CoD, which augments LLMs with prior knowledge with the chains of multilingual dictionaries for a subset of input words to elicit translation abilities for LLMs. Extensive experiments indicate that augmenting ChatGPT with CoD elicits large gains by up to 13x chrF++ points for MNMT (3.08 to 42.63 for English to Serbian written in Cyrillic script) on FLORES-200 full devtest set. We further demonstrate the importance of chaining the multilingual dictionaries, as well as the superiority of CoD to few-shot demonstration for low-resource languages.
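The abstract describes CoD as prepending chained multilingual dictionary entries for a subset of input words to the translation request. The sketch below illustrates that idea under stated assumptions: the "means" chain wording, the instruction sentence, the function names, and the toy dictionary are all hypothetical choices made for illustration, not the paper's confirmed template or code.

```python
from typing import Dict, List

# Hypothetical toy dictionary: source word -> {language name: translation}.
# A real setup would draw on an actual multilingual dictionary resource.
ToyDict = Dict[str, Dict[str, str]]


def build_chain(word: str, entry: Dict[str, str], languages: List[str]) -> str:
    """Chain one word through its translations, target language first."""
    parts = [f'"{word}"']
    for lang in languages:
        if lang in entry:  # skip languages the dictionary does not cover
            parts.append(f'"{entry[lang]}"')
    return " means ".join(parts) + "."


def build_cod_prompt(
    sentence: str,
    src: str,
    tgt: str,
    dictionary: ToyDict,
    aux_langs: List[str],
) -> str:
    """Prepend dictionary chains for covered words to a translation request.

    The chain wording and instruction text are assumptions for this sketch.
    """
    chain_langs = [tgt] + aux_langs
    seen = set()
    chains = []
    for token in sentence.split():
        word = token.strip(".,!?;:").lower()
        # Only a subset of the input words (those in the dictionary) get a chain.
        if word in dictionary and word not in seen:
            seen.add(word)
            chains.append(build_chain(word, dictionary[word], chain_langs))
    instruction = f"Translate the following text from {src} into {tgt}: {sentence}"
    return "\n".join(chains + [instruction])


toy_dictionary: ToyDict = {
    "water": {"French": "eau", "German": "Wasser"},
    "bread": {"French": "pain", "German": "Brot"},
}

print(
    build_cod_prompt(
        "The water is cold.", "English", "French", toy_dictionary, ["German"]
    )
)
```

The resulting string would be sent to the LLM in place of a plain translation request. Dropping the auxiliary languages from `aux_langs` reduces the chain to a single bilingual dictionary hint, which is the kind of ablation the abstract alludes to when it stresses the importance of chaining.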

Code Implementations: 1 repo