CL · Oct 23, 2023

That was the last straw, we need more: Are Translation Systems Sensitive to Disambiguating Context?

arXiv:2310.14610v1 · 134 citations · h-index: 22
Originality: Incremental advance
AI Analysis

This paper addresses a specific challenge in machine translation, the handling of ambiguous text, and offers incremental insights into how sensitive models are to disambiguating context.

The study examines how translation systems handle semantically ambiguous English idioms. It finds that current MT models consistently translate idioms literally even when the context signals a figurative reading, while language models are more context-aware but perform unevenly across target languages.

The translation of ambiguous text presents a challenge for translation systems, as it requires using the surrounding context to disambiguate the intended meaning as much as possible. While prior work has studied ambiguities that result from different grammatical features of the source and target language, we study semantic ambiguities that exist in the source (English in this work) itself. In particular, we focus on idioms that are open to both literal and figurative interpretations (e.g., goose egg), and collect TIDE, a dataset of 512 pairs of English sentences containing idioms with disambiguating context such that one is literal (it laid a goose egg) and another is figurative (they scored a goose egg, as in a score of zero). In experiments, we compare MT-specific models and language models for (i) their preference when given an ambiguous subsentence, (ii) their sensitivity to disambiguating context, and (iii) the performance disparity between figurative and literal source sentences. We find that current MT models consistently translate English idioms literally, even when the context suggests a figurative interpretation. On the other hand, LMs are far more context-aware, although there remain disparities across target languages. Our findings underline the potential of LMs as a strong backbone for context-aware translation.

Code Implementations: 1 repo
