CLJun 5, 2025

Please Translate Again: Two Simple Experiments on Whether Human-Like Reasoning Helps Translation

arXiv:2506.04521v213 citationsh-index: 17EMNLP
Originality Synthesis-oriented
AI Analysis

This work challenges the assumption that human-like reasoning strategies are optimal for LLM-based translation, suggesting a divergence in effective approaches.

The paper investigates whether human-like reasoning via Chain-of-Thought decomposition improves translation in LLMs, finding no clear evidence for its effectiveness and showing that a simple 'translate again' self-refinement prompt yields better results than step-by-step prompting.

Large Language Models (LLMs) demonstrate strong reasoning capabilities for many tasks, often by explicitly decomposing the task via Chain-of-Thought (CoT) reasoning. Recent work on LLM-based translation designs hand-crafted prompts to decompose translation, or trains models to incorporate intermediate steps. Translating Step-by-step (Briakou et al., 2024), for instance, introduces a multi-step prompt with decomposition and refinement of translation with LLMs, which achieved state-of-the-art results on WMT24 test data. In this work, we scrutinise this strategy's effectiveness. Empirically, we find no clear evidence that performance gains stem from explicitly decomposing the translation process via CoT, at least for the models on test; and we show prompting LLMs to 'translate again' and self-refine yields even better results than human-like step-by-step prompting. While the decomposition influences translation behaviour, faithfulness to the decomposition has both positive and negative effects on translation. Our analysis therefore suggests a divergence between the optimal translation strategies for humans and LLMs.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes