Translating Step-by-Step: Decomposing the Translation Process for Improved Translation Quality of Long-Form Texts
This addresses the challenge of high-quality long-form translation for users of machine translation systems, though it appears incremental as it builds on established translation studies processes.
The paper tackles the problem of improving translation quality for long-form texts by decomposing the translation process into multiple interactive steps (pre-translation research, drafting, refining, proofreading) using language models, resulting in state-of-the-art results on WMT2024 with large quality improvements over conventional approaches.
In this paper we present a step-by-step approach to long-form text translation, drawing on established processes in translation studies. Instead of viewing machine translation as a single, monolithic task, we propose a framework that engages language models in a multi-turn interaction, encompassing pre-translation research, drafting, refining, and proofreading, resulting in progressively improved translations. Extensive automatic evaluations using Gemini 1.5 Pro across ten language pairs show that translating step-by-step yields large translation quality improvements over conventional zero-shot prompting approaches and earlier human-like baseline strategies, resulting in state-of-the-art results on WMT2024.