CL · Mar 28, 2024

Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation

arXiv:2403.19285v2 · 2 citations · h-index: 3
Originality: Incremental advance
AI Analysis

This paper addresses the open problem of in-context example selection for machine translation, adding syntax-level similarity to the word-level features that previous methods rely on.

The paper tackles the problem of selecting informative examples for in-context learning in machine translation. It proposes a syntax-based method that compares dependency trees, together with an ensemble strategy, and achieves the highest COMET scores on 11 out of 12 translation directions between English and 6 languages.

In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs), where a few examples are demonstrated to evoke LLMs' power for a given task. How to select informative examples remains an open issue. Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features while ignoring deep syntax-level knowledge. In this paper, we propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees using Polynomial Distance. In addition, we propose an ensemble strategy combining examples selected by both word-level and syntax-level criteria. Experimental results between English and 6 common languages indicate that syntax can effectively enhance ICL for MT, obtaining the highest COMET scores on 11 out of 12 translation directions.
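As a rough illustration of the ensemble idea described in the abstract, the sketch below picks part of the demonstration set by a word-level score and the rest by a syntax-level score. Everything in it is an assumption made for illustration: the function names, the dictionary layout, the half-and-half split, and both scoring functions are hypothetical. In particular, `syntax_score` is a simple overlap of dependency-arc labels and is not the Polynomial Distance the paper uses, and `word_score` is plain token overlap rather than the paper's word-level criterion.

```python
from collections import Counter


def word_score(query_tokens, cand_tokens):
    """Word-level stand-in: Jaccard overlap of token sets (assumed, not the paper's criterion)."""
    q, c = set(query_tokens), set(cand_tokens)
    return len(q & c) / len(q | c) if q | c else 0.0


def syntax_score(query_arcs, cand_arcs):
    """Syntax-level stand-in: overlap of dependency-arc label multisets.

    This is NOT Polynomial Distance; it only shows scoring on tree
    structure instead of surface words.
    """
    q, c = Counter(query_arcs), Counter(cand_arcs)
    shared = sum((q & c).values())  # multiset intersection
    total = sum((q | c).values())   # multiset union
    return shared / total if total else 0.0


def ensemble_select(query, pool, k):
    """Pick k examples: the first k // 2 by word score, the rest by syntax score."""
    by_word = sorted(pool, key=lambda ex: word_score(query["tokens"], ex["tokens"]), reverse=True)
    by_syntax = sorted(pool, key=lambda ex: syntax_score(query["arcs"], ex["arcs"]), reverse=True)
    selected = by_word[: k // 2]
    for ex in by_syntax:
        if len(selected) >= k:
            break
        if ex not in selected:  # skip examples already chosen by the word-level ranking
            selected.append(ex)
    return selected


# Toy data: arcs are hypothetical (head relation, dependent relation) pairs.
query = {"tokens": ["the", "cat", "sleeps"], "arcs": [("root", "nsubj"), ("nsubj", "det")]}
pool = [
    {"id": "a", "tokens": ["the", "cat", "eats"], "arcs": [("root", "nsubj"), ("root", "obj")]},
    {"id": "b", "tokens": ["dogs", "bark", "loudly"], "arcs": [("root", "nsubj"), ("nsubj", "det")]},
    {"id": "c", "tokens": ["hello"], "arcs": [("root", "intj")]},
]
picked = ensemble_select(query, pool, k=2)
print([ex["id"] for ex in picked])
```

In this toy pool, example `a` shares words with the query while example `b` shares no words but has the same arc structure, so a purely word-level selector would have no reason to prefer `b` over `c`. That is the kind of gap the syntax-level criterion is meant to cover.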

