CL, Apr 15, 2017

Graph Convolutional Encoders for Syntax-aware Neural Machine Translation

arXiv:1704.04675v4, 517 citations
Originality: Incremental advance
AI Analysis

This work targets translation accuracy for languages with complex syntax. It is an incremental advance, since it builds on existing encoder-decoder models.

The authors incorporate syntactic structure into neural machine translation by applying graph-convolutional networks (GCNs) to predicted dependency trees of source sentences. They report substantial improvements over syntax-agnostic versions in English-German and English-Czech translation experiments.

We present a simple and effective approach to incorporating syntactic structure into neural attention-based encoder-decoder models for machine translation. We rely on graph-convolutional networks (GCNs), a recent class of neural networks developed for modeling graph-structured data. Our GCNs use predicted syntactic dependency trees of source sentences to produce representations of words (i.e. hidden states of the encoder) that are sensitive to their syntactic neighborhoods. GCNs take word representations as input and produce word representations as output, so they can easily be incorporated as layers into standard encoders (e.g., on top of bidirectional RNNs or convolutional neural networks). We evaluate their effectiveness with English-German and English-Czech translation experiments for different types of encoders and observe substantial improvements over their syntax-agnostic versions in all the considered setups.
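To make the "word representations in, word representations out" idea in the abstract concrete, here is a minimal NumPy sketch of one graph-convolution layer over a dependency tree. It is a simplified illustration, not the authors' formulation: the function name `gcn_layer`, the single shared weight matrix `W`, the mean aggregation over neighbours, and the toy sentence with its head indices are all assumptions made for this example.

```python
import numpy as np

def gcn_layer(H, heads, W, b):
    """One simplified graph-convolution layer over a dependency tree.

    H:     (n, d_in) word representations, e.g. encoder hidden states
    heads: heads[i] is the index of word i's syntactic head, -1 for the root
    W:     (d_in, d_out) weight matrix shared by all edges (an assumption)
    b:     (d_out,) bias vector
    """
    n = H.shape[0]
    A = np.eye(n)  # self-loops, so each word keeps its own information
    for child, head in enumerate(heads):
        if head >= 0:
            A[child, head] = 1.0  # the child receives from its head
            A[head, child] = 1.0  # the head receives from its child
    A = A / A.sum(axis=1, keepdims=True)  # average over syntactic neighbours
    return np.maximum(0.0, A @ H @ W + b)  # ReLU non-linearity

# Hypothetical example: "The monkey eats a banana", with "eats" as root.
heads = [1, 2, -1, 4, 2]
rng = np.random.default_rng(0)
H = rng.normal(size=(5, 8))         # stand-in for BiRNN or CNN outputs
W = rng.normal(size=(8, 8)) * 0.1
b = np.zeros(8)
out = gcn_layer(H, heads, W, b)     # same number of words, new features
```

Because the output has one vector per word, just like the input, a layer of this kind can be stacked on top of any encoder that yields per-word states, which is the property the abstract relies on.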

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
