CLNov 14, 2015

Character-based Neural Machine Translation

arXiv:1511.04586v197 citations
Originality Incremental advance
AI Analysis

This approach addresses translation challenges by handling unseen word forms and reducing preprocessing needs, though it is incremental as it matches rather than surpasses existing word-based methods.

The paper tackles neural machine translation by processing sentences as character sequences, composing characters into words for translation and generating words character by character, achieving results comparable to conventional word-based models.

We introduce a neural machine translation model that views the input and output sentences as sequences of characters rather than words. Since word-level information provides a crucial source of bias, our input model composes representations of character sequences into representations of words (as determined by whitespace boundaries), and then these are translated using a joint attention/translation model. In the target language, the translation is modeled as a sequence of word vectors, but each word is generated one character at a time, conditional on the previous character generations in each word. As the representation and generation of words is performed at the character level, our model is capable of interpreting and generating unseen word forms. A secondary benefit of this approach is that it alleviates much of the challenges associated with preprocessing/tokenization of the source and target languages. We show that our model can achieve translation results that are on par with conventional word-based models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes