CL · Nov 26, 2017

Learning to Remember Translation History with a Continuous Cache

arXiv:1711.09367v11167 citations
Originality: Incremental advance
AI Analysis

This work addresses the lack of document-level context in NMT, with the aim of improving translation consistency. It is an incremental advance because it builds on existing memory-augmented approaches.

The authors address the tendency of neural machine translation (NMT) models to translate sentences in isolation. They propose a light-weight cache-like memory network that stores translation history, and report effective adaptation across multiple domains at negligible computational cost.

Existing neural machine translation (NMT) models generally translate sentences in isolation, missing the opportunity to take advantage of document-level information. In this work, we propose to augment NMT models with a very light-weight cache-like memory network, which stores recent hidden representations as translation history. The probability distribution over generated words is updated online depending on the translation history retrieved from the memory, endowing NMT models with the capability to dynamically adapt over time. Experiments on multiple domains with different topics and styles show the effectiveness of the proposed approach with negligible impact on the computational cost.
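The cache mechanism described in the abstract can be illustrated with a minimal sketch. It is one simplified reading, not the authors' implementation: the names `ContinuousCache`, `combine`, and `word_distribution` are hypothetical, and the first-in-first-out eviction, dot-product matching, and fixed scalar gate are assumptions chosen to keep the example short.

```python
import math
from collections import deque


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


class ContinuousCache:
    """Fixed-size memory of (key, value) hidden-state pairs (hypothetical sketch)."""

    def __init__(self, capacity):
        # Assumption: the oldest entry is dropped once the cache is full.
        self.slots = deque(maxlen=capacity)

    def write(self, key, value):
        """Store one hidden representation from the translation history."""
        self.slots.append((list(key), list(value)))

    def read(self, query):
        """Return a similarity-weighted average of stored values, or None if empty."""
        if not self.slots:
            return None
        # Assumption: dot-product similarity between the query and each key.
        weights = softmax([dot(query, key) for key, _ in self.slots])
        dim = len(self.slots[0][1])
        return [
            sum(w * value[i] for w, (_, value) in zip(weights, self.slots))
            for i in range(dim)
        ]


def combine(state, retrieved, gate):
    """Interpolate the decoder state with retrieved history; gate lies in [0, 1]."""
    if retrieved is None:
        return list(state)
    return [(1.0 - gate) * s + gate * r for s, r in zip(state, retrieved)]


def word_distribution(state, output_embeddings):
    """Probability over target words given a (possibly cache-updated) state."""
    return softmax([dot(state, emb) for emb in output_embeddings])
```

In use, a decoder would call `write` after each generated word and `read` before the next one, so the word distribution shifts as the document is translated. A learned gate would normally replace the fixed `gate` argument; the scalar here is only a placeholder.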

Code Implementations: 1 repo
