CL | Oct 24, 2024

GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning

arXiv:2410.18702v2 | 10 citations | h-index: 13 | Has Code | ACL
Originality: Incremental advance
AI Analysis

This work addresses machine translation challenges, particularly for low-resource and endangered languages, by using linguistic annotations in a training-free manner. The contribution is incremental, since it builds on existing prompting methods.

The paper introduces GrammaMT, a grammar-informed in-context learning approach that uses Interlinear Glossed Text (IGT) in the prompt. It improves translation performance across low- to high-resource languages, and ablation studies suggest that gains of over 17 BLEU points are possible if the LLM can accurately generate or access glosses for the input sentence.

We introduce GrammaMT, a grammatically-aware prompting approach for machine translation that uses Interlinear Glossed Text (IGT), a common form of linguistic description providing morphological and lexical annotations for source sentences. GrammaMT proposes three prompting strategies: gloss-shot, chain-gloss and model-gloss. All are training-free and require only a few examples that take minimal effort to collect, which makes them well-suited for low-resource setups. Experiments show that GrammaMT enhances translation performance on open-source instruction-tuned LLMs for various low- to high-resource languages across three benchmarks: (1) the largest IGT corpus, (2) the challenging 2023 SIGMORPHON Shared Task data over endangered languages, and (3) even in an out-of-domain setting with FLORES. Moreover, ablation studies reveal that leveraging gloss resources could substantially boost MT performance (by over 17 BLEU points) if LLMs accurately generate or access input sentence glosses.
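The abstract names the three strategies without giving their prompt formats. As a rough illustration only, a few-shot prompt whose demonstrations carry an IGT gloss line might be assembled as in the sketch below. The `IGTExample` container, the `build_gloss_prompt` helper, the `Source:`/`Gloss:`/`Translation:` labels, and the Spanish sentences are all assumptions made for this example, not the paper's actual templates or data.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class IGTExample:
    """One interlinear glossed text triple (hypothetical container)."""
    source: str       # sentence in the source language
    gloss: str        # morpheme-level gloss of the source sentence
    translation: str  # reference translation


def build_gloss_prompt(
    examples: List[IGTExample],
    source: str,
    src_lang: str,
    tgt_lang: str = "English",
    gloss: Optional[str] = None,
) -> str:
    """Assemble a few-shot prompt in which every demonstration includes its gloss.

    If `gloss` is given, it is added for the test sentence as well; otherwise
    the model sees only the test source and is asked for the translation.
    The wording of the prompt is an assumption, not taken from the paper.
    """
    lines = [f"Translate the following {src_lang} sentences into {tgt_lang}."]
    for ex in examples:
        lines += [
            "",
            f"Source: {ex.source}",
            f"Gloss: {ex.gloss}",
            f"Translation: {ex.translation}",
        ]
    lines += ["", f"Source: {source}"]
    if gloss is not None:
        lines.append(f"Gloss: {gloss}")
    lines.append("Translation:")
    return "\n".join(lines)


# Illustrative demonstration written for this sketch (not from any benchmark).
demo = [IGTExample("Los gatos duermen", "DET.PL cat-PL sleep-3PL", "The cats sleep.")]

prompt = build_gloss_prompt(demo, "El perro come", "Spanish")
prompt_with_gloss = build_gloss_prompt(
    demo, "El perro come", "Spanish", gloss="DET dog eat-3SG"
)
print(prompt)
```

Passing `gloss=` loosely mirrors the setting in which the model can access a gloss for the input sentence, which is the condition the ablation result above depends on. How that gloss is obtained is left open here.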

