CL, LG | Nov 22, 2019

Neuron Interaction Based Representation Composition for Neural Machine Translation

Jian Li, Xing Wang, Baosong Yang, Shuming Shi, Michael R. Lyu, Zhaopeng Tu

arXiv:1911.09877v11.518 citations

Originality: Incremental advance

AI Analysis

This work proposes an incremental method for improving representation composition in neural machine translation, with the aim of raising translation quality.

The paper tackles the problem of capturing complex linguistic information in neural machine translation by modeling strong interactions among neurons in representation vectors, resulting in consistent performance improvements over the SOTA Transformer baseline on WMT14 English-German and English-French tasks.

Recent NLP studies reveal that substantial linguistic information can be attributed to single neurons, i.e., individual dimensions of the representation vectors. We hypothesize that modeling strong interactions among neurons helps to better capture complex information by composing the linguistic properties embedded in individual neurons. Starting from this intuition, we propose a novel approach to compose representations learned by different components in neural machine translation (e.g., multi-layer networks or multi-head attention), based on modeling strong interactions among neurons in the representation vectors. Specifically, we leverage bilinear pooling to model pairwise multiplicative interactions among individual neurons, and a low-rank approximation to make the model computationally feasible. We further propose extended bilinear pooling to incorporate first-order representations. Experiments on WMT14 English-German and English-French translation tasks show that our model consistently improves performance over the SOTA Transformer baseline. Further analyses demonstrate that our approach indeed captures more syntactic and semantic information as expected.
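The bilinear pooling idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation: the function names, the dimensions `d`, `r`, `o`, the random factor matrices `U`, `V`, `P`, and the choice to realize the "extended" variant by appending a constant 1 to each input are all assumptions made for illustration. The sketch shows that a low-rank factorization yields the same output as a full bilinear form built from the same factors, without ever materializing the full interaction tensor.

```python
import numpy as np

def low_rank_bilinear(x, y, U, V, P):
    """Low-rank bilinear pooling: z = P^T ((U^T x) * (V^T y)).

    The elementwise product of the two projections stands in for the
    pairwise multiplicative interactions x_i * y_j of a full bilinear form.
    """
    return P.T @ ((U.T @ x) * (V.T @ y))

def extended_low_rank_bilinear(x, y, U, V, P):
    """Assumed 'extended' variant: append a constant 1 to each input so the
    pooled output also contains first-order terms (x_i * 1 and 1 * y_j)."""
    x1 = np.append(x, 1.0)
    y1 = np.append(y, 1.0)
    return low_rank_bilinear(x1, y1, U, V, P)

# Hypothetical sizes: input dim d, rank r, output dim o.
rng = np.random.default_rng(0)
d, r, o = 4, 3, 2
x = rng.normal(size=d)
y = rng.normal(size=d)
U = rng.normal(size=(d + 1, r))  # +1 row for the appended constant
V = rng.normal(size=(d + 1, r))
P = rng.normal(size=(r, o))

z = extended_low_rank_bilinear(x, y, U, V, P)

# Reference: the full bilinear form with W[k] = sum_j P[j, k] * outer(U[:, j], V[:, j]).
x1, y1 = np.append(x, 1.0), np.append(y, 1.0)
W = np.einsum("ir,jr,rk->kij", U, V, P)       # shape (o, d + 1, d + 1)
z_full = np.einsum("i,kij,j->k", x1, W, y1)   # shape (o,)

print(np.allclose(z, z_full))
```

The full tensor `W` has `o * (d + 1)**2` entries, while the factors hold only `2 * (d + 1) * r + r * o`, which is the kind of saving the low-rank approximation is meant to provide.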
