Neural Machine Translation for Bilingually Scarce Scenarios: A Deep Multi-task Learning Approach
This addresses the challenge of bilingually scarce scenarios for machine translation practitioners, though it is incremental as it builds on existing multi-task learning methods.
The paper tackles the problem of neural machine translation for language pairs with limited parallel training data by using monolingual linguistic resources through a multi-task learning approach, resulting in effective translation models for English-to-French, English-to-Farsi, and English-to-Vietnamese tasks.
Neural machine translation requires large amounts of parallel training text to learn a reasonable-quality translation model. This is particularly inconvenient for language pairs for which enough parallel text is not available. In this paper, we use monolingual linguistic resources in the source side to address this challenging problem based on a multi-task learning approach. More specifically, we scaffold the machine translation task on auxiliary tasks including semantic parsing, syntactic parsing, and named-entity recognition. This effectively injects semantic and/or syntactic knowledge into the translation model, which would otherwise require a large amount of training bitext. We empirically evaluate and show the effectiveness of our multi-task learning approach on three translation tasks: English-to-French, English-to-Farsi, and English-to-Vietnamese.