CL AI Dec 19, 2020

Finding Sparse Structures for Domain Specific Neural Machine Translation

Jianze Liang, Chengqi Zhao, Mingxuan Wang, Xipeng Qiu, Lei Li

arXiv:2012.10586v22.69 citations Has Code

Originality: Incremental advance

AI Analysis

This work is relevant to researchers and practitioners in neural machine translation who need to adapt models to new domains efficiently without losing general-domain performance.

This paper addresses the problem of adapting neural machine translation models to specific domains without degrading general-domain performance or overfitting to the target domain. The proposed Prune-Tune method learns tiny domain-specific sub-networks and outperforms strong competitors on target-domain test sets without sacrificing general-domain quality, in both single-domain and multi-domain settings.

Neural machine translation often adopts fine-tuning to adapt to specific domains. However, unrestricted fine-tuning can easily degrade performance on the general domain and overfit to the target domain. To mitigate this issue, we propose Prune-Tune, a novel domain adaptation method based on gradual pruning. It learns tiny domain-specific sub-networks during fine-tuning on new domains. Prune-Tune alleviates both the overfitting and the degradation problems without modifying the model architecture. Furthermore, Prune-Tune can sequentially learn a single network with multiple disjoint domain-specific sub-networks for multiple domains. Experimental results show that Prune-Tune outperforms several strong competitors on the target-domain test set without sacrificing quality on the general domain, in both single-domain and multi-domain settings. The source code and data are available at https://github.com/ohlionel/Prune-Tune.
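
The idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation (see the repository above for that). It assumes a flat list of weights in place of a real translation model, magnitude-based pruning as the pruning criterion, and that the domain-specific sub-network occupies the positions freed by pruning. The function names `magnitude_mask`, `gradual_prune`, and `domain_finetune_step` are hypothetical and chosen only for illustration.

```python
import random

def magnitude_mask(weights, keep_ratio):
    """Return a 0/1 mask that keeps the largest-magnitude fraction of weights."""
    k = int(round(len(weights) * keep_ratio))
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]), reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(weights))]

def gradual_prune(weights, final_keep_ratio, steps):
    """Lower the keep ratio step by step, zeroing the pruned weights each time.

    Assumption: a real pipeline would keep training on general-domain data
    between pruning steps; that retraining is omitted in this toy version.
    """
    w = list(weights)
    mask = [1] * len(w)
    for s in range(1, steps + 1):
        ratio = 1.0 - (1.0 - final_keep_ratio) * s / steps
        mask = magnitude_mask(w, ratio)
        w = [wi * mi for wi, mi in zip(w, mask)]
    return w, mask

def domain_finetune_step(weights, general_mask, grads, lr):
    """One gradient step that updates only the freed (pruned) positions.

    Positions with mask 1 belong to the general-domain sub-network and stay
    frozen, so general-domain behaviour is left untouched in this sketch.
    """
    return [w if m == 1 else w - lr * g
            for w, m, g in zip(weights, general_mask, grads)]

# Toy run: 10 weights, keep 70% for the general domain, tune the rest.
random.seed(0)
weights = [random.uniform(-1, 1) for _ in range(10)]
pruned, general_mask = gradual_prune(weights, final_keep_ratio=0.7, steps=3)
grads = [1.0] * len(pruned)  # placeholder gradients from target-domain data
tuned = domain_finetune_step(pruned, general_mask, grads, lr=0.1)
```

Under the same assumptions, the multi-domain setting would partition the freed positions into disjoint masks, one per domain, and apply the same masked update with each domain's own mask.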
