CL · Sep 16, 2020

Graph-to-Sequence Neural Machine Translation

arXiv:2009.07489v1
Originality: Incremental advance
AI Analysis

This work aims to improve the accuracy of neural machine translation by incorporating explicit graph structure into the model. It is an incremental advance over existing sequence-based models.

The paper proposes a graph-to-sequence model for neural machine translation that captures graph information explicitly, improving on the Transformer by 1.1 BLEU points on WMT14 English-German and 1.0 BLEU points on IWSLT14 German-English.

Neural machine translation (NMT) usually works in a seq2seq fashion, viewing both the source and the target sentence as a linear sequence of words. Such a sequence can be regarded as a special case of a graph, with the words as nodes and the relationships between words as edges. Since current NMT models capture graph information within the sequence only in a latent way, to a greater or lesser extent, we present a graph-to-sequence model that facilitates explicit capture of graph information. In detail, we propose a graph-based NMT model built on self-attention networks (SAN), called Graph-Transformer, which captures information from subgraphs of different orders in every layer. Subgraphs are put into groups according to their order, and each group reflects a different level of dependency between words. To fuse the subgraph representations, we empirically explore three methods that weight the groups of subgraphs of different orders. Experiments on WMT14 English-German and IWSLT14 German-English show that our method effectively boosts the Transformer, with an improvement of 1.1 BLEU points on the WMT14 English-German dataset and 1.0 BLEU points on the IWSLT14 German-English dataset.
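To make the two ideas in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's implementation. The abstract does not say which word relationships form the edges or what the three fusion methods are, so the adjacent-word edges, the softmax weighting, and the names `sentence_to_chain_graph` and `fuse_groups` are all assumptions made for illustration.

```python
import math


def sentence_to_chain_graph(words):
    """View a sentence as a graph: each word is a node.

    Assumption: only adjacent words are linked, which gives the simplest
    "sequence as a special case of a graph" reading.
    """
    nodes = list(range(len(words)))
    edges = [(i, i + 1) for i in range(len(words) - 1)]
    return nodes, edges


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_groups(group_vectors, group_scores):
    """Fuse one vector per subgraph group into a single vector.

    Assumption: groups are weighted by a softmax over one scalar score
    per group; in a trained model these scores would be learned.
    """
    weights = softmax(group_scores)
    dim = len(group_vectors[0])
    return [
        sum(w * vec[d] for w, vec in zip(weights, group_vectors))
        for d in range(dim)
    ]


words = ["wir", "sehen", "das", "Haus"]
nodes, edges = sentence_to_chain_graph(words)

# Three hypothetical group representations (one per subgraph order) for a
# single position, with equal scores so every group gets the same weight.
fused = fuse_groups([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0])
```

With equal scores the fusion reduces to a plain average of the group vectors; unequal scores would let one order of subgraph dominate the fused representation.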
