Going Beyond Word Matching: Syntax Improves In-context Example Selection for Machine Translation
This paper addresses the open issue of in-context example selection for machine translation, offering a domain-specific improvement over previous word-level methods.
The paper tackles the problem of selecting informative examples for in-context learning in machine translation. It proposes a syntax-based method that uses dependency trees, together with an ensemble strategy, and achieves the highest COMET scores on 11 out of 12 translation directions between English and 6 languages.
In-context learning (ICL) is the trending prompting strategy in the era of large language models (LLMs), where a few examples are demonstrated to evoke LLMs' power for a given task. How to select informative examples remains an open issue. Previous works on in-context example selection for machine translation (MT) focus on superficial word-level features while ignoring deep syntax-level knowledge. In this paper, we propose a syntax-based in-context example selection method for MT, by computing the syntactic similarity between dependency trees using Polynomial Distance. In addition, we propose an ensemble strategy combining examples selected by both word-level and syntax-level criteria. Experimental results between English and 6 common languages indicate that syntax can effectively enhance ICL for MT, obtaining the highest COMET scores on 11 out of 12 translation directions.
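The ensemble idea described in the abstract can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the `Example` type, the function names, the even split between the two criteria, and both scoring functions are hypothetical. In particular, `syntax_score` is a simple stand-in that compares sets of dependency relation labels; it is not the Polynomial Distance over dependency trees that the paper uses. Sentences are assumed to have been parsed beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Example:
    source: str        # source-side sentence
    target: str        # reference translation (empty for the query)
    dep_labels: tuple  # dependency relation labels, assumed pre-parsed


def jaccard(a, b):
    """Jaccard overlap between two collections, treated as sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if (a or b) else 0.0


def word_score(query, ex):
    # Word-level stand-in: overlap of lowercased whitespace tokens.
    return jaccard(query.source.lower().split(), ex.source.lower().split())


def syntax_score(query, ex):
    # Syntax-level stand-in: overlap of dependency relation labels.
    # This is NOT the Polynomial Distance used in the paper.
    return jaccard(query.dep_labels, ex.dep_labels)


def ensemble_select(query, pool, k=4):
    """Pick k // 2 examples by word score, then fill up to k by syntax score."""
    by_word = sorted(pool, key=lambda ex: word_score(query, ex), reverse=True)
    by_syntax = sorted(pool, key=lambda ex: syntax_score(query, ex), reverse=True)
    chosen = by_word[: k // 2]
    for ex in by_syntax:
        if len(chosen) >= k:
            break
        if ex not in chosen:  # skip examples already picked by word score
            chosen.append(ex)
    return chosen
```

Calling `ensemble_select(query, pool, k=4)` returns a list of up to four examples that could then be formatted into the prompt. Splitting the budget evenly is only one possible design choice; the abstract does not specify how the two criteria are balanced.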