On Compositionality in Neural Machine Translation
This paper addresses the challenge of compositionality in neural machine translation systems, but the contribution is incremental: it builds on existing models and adds a new pre-training approach.
The paper tackles compositionality in neural machine translation by evaluating a standard sequence-to-sequence model on productivity and systematicity. It finds that poor encoder representations are a bottleneck and proposes a pre-training mechanism that significantly improves BLEU scores.
We investigate two specific manifestations of compositionality in Neural Machine Translation (NMT): (1) Productivity, the ability of the model to extend its predictions beyond the lengths observed in the training data, and (2) Systematicity, the ability of the model to systematically recombine known parts and rules. We evaluate a standard sequence-to-sequence model on tests designed to assess these two properties in NMT. We quantitatively demonstrate that inadequate temporal processing, in the form of poor encoder representations, is a bottleneck for both Productivity and Systematicity. We propose a simple pre-training mechanism that improves model performance on the two properties and leads to a significant improvement in BLEU scores.
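
To make the two properties concrete, the sketch below shows one way such test splits could be built from a parallel corpus. It is an illustration only: the abstract does not describe how the paper's tests are constructed, so the function names (`productivity_split`, `systematicity_split`), the whitespace tokenization, the length threshold, and the toy sentence pairs are all assumptions.

```python
from typing import List, Tuple

Pair = Tuple[str, str]  # (source sentence, target sentence)

# Toy parallel data, invented purely for illustration.
PAIRS: List[Pair] = [
    ("the cat sleeps", "le chat dort"),
    ("the dog runs", "le chien court"),
    ("the cat runs", "le chat court"),
    ("the dog sleeps and the cat runs", "le chien dort et le chat court"),
]


def productivity_split(pairs: List[Pair], max_train_len: int) -> Tuple[List[Pair], List[Pair]]:
    """Train on short sources, test on sources longer than any seen in training.

    Length is counted in whitespace-separated tokens (an assumption).
    """
    train = [p for p in pairs if len(p[0].split()) <= max_train_len]
    test = [p for p in pairs if len(p[0].split()) > max_train_len]
    return train, test


def systematicity_split(pairs: List[Pair], held_out: Tuple[str, str]) -> Tuple[List[Pair], List[Pair]]:
    """Hold out every source sentence that contains both words of `held_out`.

    Each word may still occur on its own in training, so the test set
    probes recombination of known parts rather than unseen vocabulary.
    """
    first, second = held_out
    train, test = [], []
    for src, tgt in pairs:
        tokens = set(src.split())
        if first in tokens and second in tokens:
            test.append((src, tgt))
        else:
            train.append((src, tgt))
    return train, test
```

With the toy data, `productivity_split(PAIRS, 3)` keeps the three 3-token sentences for training and holds out the 7-token sentence, while `systematicity_split(PAIRS, ("cat", "runs"))` holds out the two sentences in which "cat" and "runs" co-occur, leaving each word present in training only separately.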