CL, Oct 23, 2020

Rapid Domain Adaptation for Machine Translation with Monolingual Data

arXiv:2010.12652v1
9 citations
Originality: Incremental advance
AI Analysis

The paper addresses the need for timely, accurate translation in emerging domains such as COVID-19, where parallel data is scarce. The advance is incremental because it builds on existing unsupervised translation methods.

The paper tackles the problem of adapting machine translation systems to new domains when little parallel data is available. Using only in-domain monolingual data, it reports significant gains in in-domain translation quality with little or no drop in general-domain performance.

One challenge of machine translation is how to adapt quickly to unseen domains in the face of surging events like COVID-19, when timely and accurate translation of in-domain information into multiple languages is critical but little parallel data is available yet. In this paper, we propose an approach that enables rapid domain adaptation from the perspective of unsupervised translation. Our proposed approach requires only in-domain monolingual data and can be quickly applied to a preexisting translation system trained on general-domain data, reaching significant gains in in-domain translation quality with little or no drop in general-domain quality. We also propose an effective procedure for simultaneous adaptation to multiple domains and languages. To the best of our knowledge, this is the first attempt to address unsupervised multilingual domain adaptation.
