CL Aug 7, 2023

Negative Lexical Constraints in Neural Machine Translation

Josef Jon, Dušan Variš, Michal Novák, João Paulo Aires, Ondřej Bojar

arXiv:2308.03601v1 21.2133 citations h-index: 48

Originality: Incremental advance

AI Analysis

This paper addresses a practical problem for machine translation users who need to keep specific words out of the output, but the advance is incremental because it builds on existing constraint methods.

This paper tackled the problem of negative lexical constraints in English-to-Czech neural machine translation, where models sometimes evade constraints by generating different word forms. The proposed method of training with stemmed constraints improved constraint adherence, though the issue persisted in many cases.

This paper explores negative lexical constraining in English-to-Czech neural machine translation. Negative lexical constraining is used to prohibit certain words or expressions in the translation produced by the neural translation model. We compared various methods based on modifying either the decoding process or the training data. The comparison was performed on two tasks: paraphrasing and feedback-based translation refinement. We also studied to what extent these methods "evade" the constraints presented to the model (usually in the dictionary form) by generating a different surface form of a given constraint. We propose a way to mitigate the issue through training with stemmed negative constraints, which counters the model's ability to produce a variety of surface forms of a word and thereby bypass the constraint. We demonstrate that our method improves the constraining, although the problem still persists in many cases.
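The "evasion" described in the abstract can be illustrated with a minimal sketch. It is not the authors' method or evaluation code: the suffix list and the helpers `toy_stem`, `violates_surface`, and `violates_stem` are hypothetical, and the stemmer is deliberately crude, standing in for a real Czech stemmer. The sketch only shows why an exact surface match on the dictionary form misses an inflected form that a stem-level match would catch.

```python
import re

# Hypothetical, deliberately crude suffix list for illustration only;
# a real system would use a proper Czech stemmer or lemmatizer.
TOY_SUFFIXES = ("ami", "ech", "ům", "ou", "em", "y", "u", "a", "e", "o", "i")


def toy_stem(word: str) -> str:
    """Strip the longest matching suffix, keeping at least 3 characters."""
    word = word.lower()
    for suffix in sorted(TOY_SUFFIXES, key=len, reverse=True):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def violates_surface(translation: str, constraint: str) -> bool:
    """True if the exact surface form of the constraint appears."""
    return constraint.lower() in tokenize(translation)


def violates_stem(translation: str, constraint: str) -> bool:
    """True if any token shares its (toy) stem with the constraint."""
    target = toy_stem(constraint)
    return any(toy_stem(token) == target for token in tokenize(translation))


# The negative constraint is given in dictionary form ("kniha", "book"),
# but the output uses the accusative form "knihu".
constraint = "kniha"
translation = "Četl jsem knihu."

print(violates_surface(translation, constraint))  # exact-match check
print(violates_stem(translation, constraint))     # stem-level check
```

With this example the exact-match check reports no violation while the stem-level check does, which is the kind of gap that constraints given in stemmed form are meant to close.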

