CL · Mar 12

Translationese as a Rational Response to Translation Task Difficulty

arXiv:2603.12050v1 · 12.6 · h-index: 8

Predicted impact: top 93% in CL · last 90 days · Originality: Incremental advance

AI Analysis

The paper addresses the lack of a unified explanatory account of translationese. It is an incremental contribution to computational linguistics and translation studies.

The study tests whether translationese can be predicted from quantifiable measures of translation task difficulty. It finds that difficulty partly explains translationese, especially in English-to-German, and that cross-lingual transfer difficulty contributes more than source-text complexity in most experiments.

Translations systematically diverge from texts originally produced in the target language, a phenomenon widely referred to as translationese. Translationese has been attributed to production tendencies (e.g. interference, simplification), socio-cultural variables, and language-pair effects, yet a unified explanatory account is still lacking. We propose that translationese reflects cognitive load inherent in the translation task itself. We test whether observable translationese can be predicted from quantifiable measures of translation task difficulty. Translationese is operationalised as a segment-level translatedness score produced by an automatic classifier. Translation task difficulty is conceptualised as comprising source-text and cross-lingual transfer components, operationalised mainly through information-theoretic metrics based on LLM surprisal, complemented by established syntactic and semantic alternatives. We use a bidirectional English-German corpus comprising written and spoken subcorpora. Results indicate that translationese can be partly explained by translation task difficulty, especially in English-to-German. For most experiments, cross-lingual transfer difficulty contributes more than source-text complexity. Information-theoretic indicators match or outperform traditional features in written mode, but offer no advantage in spoken mode. Source-text syntactic complexity and translation-solution entropy emerged as the strongest predictors of translationese across language pairs and modes.
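The two information-theoretic quantities named in the abstract, surprisal and translation-solution entropy, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names are hypothetical, the token probabilities are assumed to come from some language model (none is called here), and the entropy is estimated from a plain list of alternative translations, which may differ from how the paper defines it.

```python
import math
from collections import Counter


def surprisal_bits(prob):
    """Surprisal (in bits) of an event with probability `prob`."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def mean_surprisal_bits(token_probs):
    """Average per-token surprisal of a segment.

    `token_probs` is assumed to hold the probability a language model
    assigned to each token; obtaining those values is out of scope here.
    """
    if not token_probs:
        raise ValueError("token_probs must not be empty")
    return sum(surprisal_bits(p) for p in token_probs) / len(token_probs)


def solution_entropy_bits(translations):
    """Shannon entropy (in bits) over observed translation solutions.

    Assumption: each item is one translation chosen for the same source
    unit; more varied choices give higher entropy.
    """
    if not translations:
        raise ValueError("translations must not be empty")
    counts = Counter(translations)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    print(mean_surprisal_bits([0.5, 0.25]))
    print(solution_entropy_bits(["Haus", "Haus", "Heim", "Gebäude"]))
```

Under this reading, a source unit that translators render in many different ways has high solution entropy, which is one way to quantify cross-lingual transfer difficulty.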
