CL · ML · Oct 7, 2016

Morphology Generation for Statistical Machine Translation using Deep Learning Techniques

arXiv:1610.02209v2
Originality: Incremental advance
AI Analysis

This paper addresses the morphology problems that arise in machine translation between morphologically unbalanced languages such as Chinese and Spanish. The advance is incremental because it builds on existing deep learning techniques.

The paper tackles morphology generation for machine translation by decoupling it from translation and simplifying morphology, achieving over 98% accuracy in gender classification and 93% in number classification, with a 0.7 METEOR improvement in translation.

Morphology in unbalanced languages remains a big challenge in the context of machine translation. In this paper, we propose to de-couple machine translation from morphology generation in order to better deal with the problem. We investigate morphology simplification with a reasonable trade-off between expected gain and generation complexity. For the Chinese-Spanish task, the optimum morphological simplification is in gender and number. For this purpose, we design a new classification architecture which, compared to other standard machine learning techniques, obtains the best results. The proposed neural architecture consists of several layers: an embedding layer, a convolutional layer followed by a recurrent neural network, and final sigmoid and softmax layers. We obtain over 98% accuracy in gender classification, over 93% accuracy in number classification, and an overall translation improvement of 0.7 METEOR.
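The decoupling described in the abstract can be illustrated with a minimal sketch. Gender and number are stripped from the target-side tokens so that the translation system has fewer forms to choose from, and a separate classifier fills them back in afterwards. Everything here is an assumption made for illustration and is not taken from the paper: the `Token` representation, the `"?"` placeholder, and the `simplify` and `restore` helpers are hypothetical, and the classifier is passed in as a plain callable rather than the neural architecture the authors propose.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

UNKNOWN = "?"  # hypothetical placeholder for a feature removed by simplification


@dataclass(frozen=True)
class Token:
    """Hypothetical analysed target-side token (not the paper's tag format)."""
    lemma: str
    pos: str      # coarse part of speech, e.g. "DET", "NOUN", "ADJ"
    gender: str   # "M", "F", or "-" when the word carries no gender
    number: str   # "S", "P", or "-" when the word carries no number


def simplify(tokens: List[Token]) -> List[Token]:
    """Hide gender and number wherever a token carries them."""
    simplified = []
    for tok in tokens:
        gender = UNKNOWN if tok.gender in ("M", "F") else tok.gender
        number = UNKNOWN if tok.number in ("S", "P") else tok.number
        simplified.append(replace(tok, gender=gender, number=number))
    return simplified


# A classifier looks at the whole simplified sentence and a position,
# and returns a (gender, number) prediction for that position.
Classifier = Callable[[List[Token], int], Tuple[str, str]]


def restore(tokens: List[Token], classify: Classifier) -> List[Token]:
    """Fill each hidden feature with the classifier's prediction."""
    restored = []
    for i, tok in enumerate(tokens):
        if UNKNOWN in (tok.gender, tok.number):
            gender, number = classify(tokens, i)
            tok = replace(
                tok,
                gender=gender if tok.gender == UNKNOWN else tok.gender,
                number=number if tok.number == UNKNOWN else tok.number,
            )
        restored.append(tok)
    return restored


if __name__ == "__main__":
    # "muy" + "las casas blancas", written as lemma plus features.
    sentence = [
        Token("muy", "ADV", "-", "-"),
        Token("el", "DET", "F", "P"),
        Token("casa", "NOUN", "F", "P"),
        Token("blanco", "ADJ", "F", "P"),
    ]
    hidden = simplify(sentence)

    # Stand-in classifier that always answers feminine plural; a trained
    # model would take its place in a real pipeline.
    recovered = restore(hidden, lambda toks, i: ("F", "P"))
    print(recovered == sentence)
```

In this sketch the translation system would be trained on the output of `simplify`, and the accuracy figures quoted in the abstract correspond to how often the classifier passed to `restore` predicts the right value for each hidden feature.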

