CLAug 29, 2017

Neural Machine Translation Training in a Multi-Domain Scenario

Hassan Sajjad, Nadir Durrani, Fahim Dalvi, Yonatan Belinkov, Stephan Vogel

arXiv:1708.08712v333.9661 citations

Originality Incremental advance

AI Analysis

This work addresses domain adaptation challenges in machine translation, offering practical insights for improving translation systems across diverse domains, though it is incremental in nature.

The paper tackles the problem of training neural machine translation systems in multi-domain scenarios by comparing methods like data concatenation, model stacking, data selection, and ensemble models, finding that initial training on concatenated out-of-domain data followed by fine-tuning on in-domain data yields the best translation quality.

In this paper, we explore alternative ways to train a neural machine translation system in a multi-domain scenario. We investigate data concatenation (with fine tuning), model stacking (multi-level fine tuning), data selection and multi-model ensemble. Our findings show that the best translation quality can be achieved by building an initial system on a concatenation of available out-of-domain data and then fine-tuning it on in-domain data. Model stacking works best when training begins with the furthest out-of-domain data and the model is incrementally fine-tuned with the next furthest domain and so on. Data selection did not give the best results, but can be considered as a decent compromise between training time and translation quality. A weighted ensemble of different individual models performed better than data selection. It is beneficial in a scenario when there is no time for fine-tuning an already trained model.

View on arXiv PDF

Similar