CL, Jun 19, 2024

LLMs Are Zero-Shot Context-Aware Simultaneous Translators

Roman Koshkin, Katsuhito Sudoh, Satoshi Nakamura

arXiv:2406.13476v315.427 citations, h-index: 44, Has Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of building efficient and accurate simultaneous translation systems for multilingual and technical domains, offering a zero-shot approach that avoids resource-intensive training.

The paper shows that open-source large language models (LLMs) achieve performance comparable to or better than state-of-the-art baselines in simultaneous machine translation (SiMT) tasks without any training, and that adding minimal background information further improves results, especially for technical content.

The advent of transformers has fueled progress in machine translation. More recently, large language models (LLMs) have come into the spotlight thanks to their generality and strong performance in a wide range of language tasks, including translation. Here we show that open-source LLMs perform on par with or better than some state-of-the-art baselines in simultaneous machine translation (SiMT) tasks, zero-shot. We also demonstrate that injection of minimal background information, which is easy with an LLM, brings further performance gains, especially on challenging technical subject matter. This highlights LLMs' potential for building the next generation of massively multilingual, context-aware and terminologically accurate SiMT systems that require no resource-intensive training or fine-tuning.
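To make the idea of injecting background information into a zero-shot prompt concrete, here is a minimal sketch. It is not the authors' implementation: the prompt wording, the function names `build_prompt` and `simultaneous_translate`, the `next_word` callable standing in for an LLM, and the simple "wait for a few source words, then emit one target word per new source word" policy are all assumptions made for illustration.

```python
from typing import Callable, List


def build_prompt(background: str, source_prefix: str, target_prefix: str) -> str:
    """Assemble a prompt from optional background info and the partial texts.

    The layout is hypothetical; the paper's actual prompt may differ.
    """
    parts = ["Translate the partial source text, continuing the partial translation."]
    if background:
        # Background injection: one extra line of context, no training needed.
        parts.append(f"Background: {background}")
    parts.append(f"Source so far: {source_prefix}")
    parts.append(f"Translation so far: {target_prefix}")
    return "\n".join(parts)


def simultaneous_translate(
    words: List[str],
    background: str,
    next_word: Callable[[str], str],  # hypothetical stand-in for an LLM call
    wait: int = 2,
) -> List[str]:
    """Read source words one at a time; after `wait` words, emit one target word per step."""
    target: List[str] = []
    for i in range(1, len(words) + 1):
        if i < wait:
            continue  # not enough source context yet, keep reading
        prompt = build_prompt(background, " ".join(words[:i]), " ".join(target))
        out = next_word(prompt)
        if out:
            target.append(out)
    return target
```

In use, `next_word` would wrap whatever model is available and return the next target word given the prompt. Passing an empty `background` simply drops the background line, which makes it easy to compare runs with and without the extra context.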

