CL · Mar 10, 2025

Contextual Cues in Machine Translation: Investigating the Potential of Multi-Source Input Strategies in LLMs and NMT Systems

Lia Shahnazaryan, Patrick Simianer, Joern Wuebker

arXiv:2503.07195v1 · 1 citation · h-index: 16 · RANLP

Originality: Synthesis-oriented

AI Analysis

This work addresses translation quality improvements for domain-specific and linguistically distant language pairs, though it appears incremental in exploring established multi-source approaches.

The study investigated how multi-source input strategies, which use intermediate language translations as contextual cues, affect machine translation quality for English and Chinese into Portuguese, comparing GPT-4o with a traditional multilingual NMT system. Results showed significant improvements for domain-specific datasets and suggested possible gains for linguistically distant language pairs, with diminishing returns on benchmarks with high linguistic variability. Within the NMT system, shallow fusion worked better when high-resource languages served as context, which points to the importance of strategic context language selection.

We explore the impact of multi-source input strategies on machine translation (MT) quality, comparing GPT-4o, a large language model (LLM), with a traditional multilingual neural machine translation (NMT) system. Using intermediate language translations as contextual cues, we evaluate their effectiveness in enhancing English and Chinese translations into Portuguese. Results suggest that contextual information significantly improves translation quality for domain-specific datasets and potentially for linguistically distant language pairs, with diminishing returns observed in benchmarks with high linguistic variability. Additionally, we demonstrate that shallow fusion, a multi-source approach we apply within the NMT system, shows improved results when using high-resource languages as context for other translation pairs, highlighting the importance of strategic context language selection.
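
The abstract does not spell out how shallow fusion is computed, so the following is only a minimal sketch under one common assumption: at each decoding step, the next-token log-probabilities obtained by conditioning the same model on each source (for example, the original English sentence and an intermediate-language translation) are combined by a weighted average and renormalized. The function names, the weights, and the toy logits are hypothetical and are not taken from the paper.

```python
import math


def log_softmax(scores):
    """Convert raw scores to log-probabilities in a numerically stable way."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def shallow_fusion_step(per_source_logits, weights=None):
    """Fuse next-token distributions from several source-conditioned passes.

    per_source_logits: one list of vocabulary scores per source input
                       (assumed: same model, same target prefix).
    weights: optional interpolation weights, one per source; uniform if omitted.
    Returns renormalized fused log-probabilities over the vocabulary.
    """
    n = len(per_source_logits)
    if weights is None:
        weights = [1.0 / n] * n
    log_probs = [log_softmax(logits) for logits in per_source_logits]
    vocab_size = len(log_probs[0])
    # Weighted average in log space (a log-linear combination of the sources).
    fused = [
        sum(w * lp[i] for w, lp in zip(weights, log_probs))
        for i in range(vocab_size)
    ]
    return log_softmax(fused)


def best_token(log_probs):
    """Index of the highest-scoring vocabulary entry (greedy choice)."""
    return max(range(len(log_probs)), key=lambda i: log_probs[i])


# Toy example with a three-token vocabulary (values are made up).
main_source_logits = [2.0, 1.0, 0.0]     # pass conditioned on the main source
context_source_logits = [0.0, 3.0, 0.0]  # pass conditioned on the context language
fused_log_probs = shallow_fusion_step([main_source_logits, context_source_logits])
chosen = best_token(fused_log_probs)
```

In this toy case the main source alone prefers token 0, while the fused distribution prefers token 1 because the context source is much more confident about it. Adjusting `weights` is one way to express how much a given context language should be trusted, which loosely mirrors the abstract's point about choosing context languages strategically.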
