Document-level Neural Machine Translation with Document Embeddings
This work addresses the need for better document-level context in machine translation, offering a domain-specific solution that is incremental in nature.
The paper tackles the lack of document-level context in standard neural machine translation by introducing multiple forms of document embeddings that model deeper and richer context, yielding significant improvements in translation performance over strong baselines.
Standard neural machine translation (NMT) assumes that translation is independent of document-level context. Most existing document-level NMT methods settle for a shallow sense of brief document-level information, whereas this work exploits detailed document-level context through multiple forms of document embeddings, which can model deeper and richer document-level context. The proposed document-aware NMT enhances the Transformer baseline by introducing both global and local document-level clues on the source end. Experiments show that the proposed method significantly improves translation performance over strong baselines and other related studies.
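To make the idea of global and local document-level clues on the source end concrete, the following is a minimal sketch rather than the paper's actual formulation. It assumes, purely for illustration, that the global embedding is the average of all source token embeddings in the document, that the local embedding is the average over tokens in neighbouring sentences, and that both are simply added to each token embedding of the sentence being translated. The function names and the `window` parameter are hypothetical.

```python
from typing import List

Vector = List[float]
Sentence = List[Vector]   # one embedding per source token
Document = List[Sentence]  # one entry per source sentence


def mean_vector(vectors: List[Vector]) -> Vector:
    """Element-wise average of a non-empty list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def global_embedding(doc: Document) -> Vector:
    """Assumed global clue: average over every token in the document."""
    return mean_vector([tok for sent in doc for tok in sent])


def local_embedding(doc: Document, index: int, window: int = 1) -> Vector:
    """Assumed local clue: average over tokens of neighbouring sentences."""
    lo, hi = max(0, index - window), min(len(doc), index + window + 1)
    neighbours = [tok for j in range(lo, hi) if j != index for tok in doc[j]]
    # Fall back to the sentence itself when it has no neighbours.
    return mean_vector(neighbours or doc[index])


def document_aware_input(doc: Document, index: int, window: int = 1) -> Sentence:
    """Add both assumed document embeddings to each token of one sentence."""
    g = global_embedding(doc)
    loc = local_embedding(doc, index, window)
    return [[t + gi + li for t, gi, li in zip(tok, g, loc)] for tok in doc[index]]


if __name__ == "__main__":
    # Toy document: three sentences with 2-dimensional token embeddings.
    doc = [[[1.0, 0.0], [3.0, 0.0]], [[0.0, 2.0]], [[2.0, 4.0]]]
    print(document_aware_input(doc, index=0))
```

In a real system the enriched token embeddings would feed a Transformer encoder, and the pooling and combination steps would typically be learned rather than fixed averages and sums; this sketch only shows where the two kinds of context enter on the source side.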