CL · May 16, 2022

Towards Debiasing Translation Artifacts

Koel Dutta Chowdhury, Rricha Jalota, Cristina España-Bonet, Josef van Genabith

arXiv:2205.08001v131.8629 citations · h-index: 42 · Has Code

Originality: Incremental advance

AI Analysis

This paper addresses translation bias in cross-lingual NLP, which can degrade performance on tasks such as NLI. The contribution is incremental because it extends an existing bias-removal technique.

The paper tackles translation artifacts (translationese), which affect cross-lingual NLP tasks, by proposing a debiasing approach based on Iterative Null-space Projection (INLP). The approach reduces translationese at the sentence and word levels and improves accuracy on a natural language inference task.

Cross-lingual natural language processing relies on translation, either by humans or machines, at different levels, from translating training data to translating test sets. However, compared to original texts in the same language, translations possess distinct qualities referred to as translationese. Previous research has shown that these translation artifacts influence the performance of a variety of cross-lingual tasks. In this work, we propose a novel approach to reducing translationese by extending an established bias-removal technique. We use the Iterative Null-space Projection (INLP) algorithm and show, by measuring classification accuracy before and after debiasing, that translationese is reduced at both sentence and word level. We evaluate the utility of debiasing translationese on a natural language inference (NLI) task, and show that by reducing this bias, NLI accuracy improves. To the best of our knowledge, this is the first study to debias translationese as represented in latent embedding space.
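To make the idea in the abstract concrete, here is a minimal sketch of INLP on toy data. It is not the authors' code. The synthetic "embeddings", the single planted bias direction, the NumPy logistic-regression probe, and all function names (`train_probe`, `probe_accuracy`, `inlp`) are assumptions made for illustration. The sketch repeatedly trains a linear probe to predict the (here synthetic) original-vs-translated label, projects the data onto the null space of the probe's weight vector, and then compares probe accuracy before and after debiasing.

```python
import numpy as np

def train_probe(X, y, steps=500, lr=0.1):
    """Fit a logistic-regression probe with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(steps):
        z = np.clip(X @ w + b, -30.0, 30.0)  # clip to keep exp() well behaved
        p = 1.0 / (1.0 + np.exp(-z))
        err = p - y
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b

def probe_accuracy(X, y):
    """Train a fresh probe on (X, y) and return its training accuracy."""
    w, b = train_probe(X, y)
    pred = (X @ w + b > 0).astype(int)
    return float((pred == y).mean())

def inlp(X, y, n_iters=5):
    """Return a matrix P that removes n_iters linearly predictive directions."""
    d = X.shape[1]
    P = np.eye(d)
    X_proj = X.copy()
    for _ in range(n_iters):
        w, _ = train_probe(X_proj, y)
        norm = np.linalg.norm(w)
        if norm < 1e-8:  # the probe found nothing left to remove
            break
        u = w / norm
        P_i = np.eye(d) - np.outer(u, u)  # projection onto the null space of w
        P = P_i @ P
        X_proj = X_proj @ P_i  # P_i is symmetric, so this applies it to each row
    return P

# Toy data (assumption): 8-dimensional vectors with the label leaking into dim 0.
rng = np.random.default_rng(0)
n, d = 2000, 8
y = rng.integers(0, 2, size=n)
X = rng.normal(size=(n, d))
X[:, 0] += 3.0 * (2 * y - 1)

acc_before = probe_accuracy(X, y)
P = inlp(X, y, n_iters=5)
acc_after = probe_accuracy(X @ P.T, y)
print(f"probe accuracy before: {acc_before:.3f}, after: {acc_after:.3f}")
```

In this toy setting the probe should be highly accurate before projection and close to chance afterwards, which mirrors the before-and-after classification check the abstract describes. Real sentence or word embeddings would replace `X`, and the choice of probe and number of iterations would need tuning.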
