CL | AI | Oct 24, 2023

Dissecting In-Context Learning of Translations in GPTs

arXiv:2310.15987v1 | 4 citations | h-index: 23
Originality: Incremental advance
AI Analysis

This work addresses the problem of understanding and enhancing in-context learning for machine translation in LLMs, offering incremental improvements for researchers and practitioners in NLP.

The paper investigates the role of demonstration attributes in in-context learning for machine translation with GPT-3. It finds that target-side perturbations significantly reduce translation quality, while source-side perturbations have little impact. It also proposes a Zero-Shot-Context method that makes zero-shot translation competitive with few-shot approaches.

Most of the recent work in leveraging Large Language Models (LLMs) such as GPT-3 for Machine Translation (MT) has focused on selecting the few-shot samples for prompting. In this work, we try to better understand the role of demonstration attributes for the in-context learning of translations through perturbations of high-quality, in-domain demonstrations. We find that asymmetric perturbation of the source-target mappings yields vastly different results. We show that the perturbation of the source side has surprisingly little impact, while target perturbation can drastically reduce translation quality, suggesting that it is the output text distribution that provides the most important learning signal during in-context learning of translations. We propose a method named Zero-Shot-Context to add this signal automatically in Zero-Shot prompting. We demonstrate that it improves upon the zero-shot translation performance of GPT-3, even making it competitive with few-shot prompted translations.
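
To make the source-side versus target-side comparison concrete, the following is a minimal sketch of how one side of each few-shot demonstration could be perturbed before the prompt is assembled. It is an illustration only, not the authors' code: the word-order jumble stands in for whatever perturbations the paper actually applies, and the function names (`jumble`, `perturb_demos`, `build_prompt`), the prompt template, and the German-English language pair are all assumptions introduced here.

```python
import random


def jumble(sentence, rng):
    """Return the sentence with its whitespace-separated tokens in random order."""
    tokens = sentence.split()
    rng.shuffle(tokens)
    return " ".join(tokens)


def perturb_demos(demos, side, seed=0):
    """Jumble one side of every (source, target) demonstration pair.

    side: "source" or "target" (hypothetical labels chosen for this sketch).
    The other side of each pair is returned unchanged.
    """
    if side not in ("source", "target"):
        raise ValueError("side must be 'source' or 'target'")
    rng = random.Random(seed)  # seeded so the perturbation is reproducible
    perturbed = []
    for src, tgt in demos:
        if side == "source":
            perturbed.append((jumble(src, rng), tgt))
        else:
            perturbed.append((src, jumble(tgt, rng)))
    return perturbed


def build_prompt(demos, test_source, src_lang="German", tgt_lang="English"):
    """Assemble a few-shot translation prompt (assumed template, not the paper's)."""
    lines = []
    for src, tgt in demos:
        lines.append(f"{src_lang}: {src}")
        lines.append(f"{tgt_lang}: {tgt}")
    lines.append(f"{src_lang}: {test_source}")
    lines.append(f"{tgt_lang}:")  # left open for the model to complete
    return "\n".join(lines)
```

Calling `build_prompt(perturb_demos(demos, "target"), test_source)` and `build_prompt(perturb_demos(demos, "source"), test_source)` on the same demonstrations gives two prompts that differ only in which side was corrupted, which is the kind of controlled comparison the abstract describes.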

