CLLGNEApr 28, 2015

Lexical Translation Model Using a Deep Neural Network Architecture

arXiv:1504.07395v13 citations
Originality Incremental advance
AI Analysis

This work addresses data sparsity and dependency modeling in machine translation for language pairs, though it appears incremental as it builds on existing models.

The paper tackled the problem of lexical translation by integrating deep neural networks into the Discriminative Word Lexicon model to leverage non-linear dependencies and reduce data sparsity, resulting in a performance improvement of up to 0.5 BLEU points for three language pairs on the TED translation task.

In this paper we combine the advantages of a model using global source sentence contexts, the Discriminative Word Lexicon, and neural networks. By using deep neural networks instead of the linear maximum entropy model in the Discriminative Word Lexicon models, we are able to leverage dependencies between different source words due to the non-linearity. Furthermore, the models for different target words can share parameters and therefore data sparsity problems are effectively reduced. By using this approach in a state-of-the-art translation system, we can improve the performance by up to 0.5 BLEU points for three different language pairs on the TED translation task.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes