CL · LG · Nov 2, 2018

An Empirical Exploration of Curriculum Learning for Neural Machine Translation

arXiv:1811.00739v1 · 124 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses training efficiency for machine translation practitioners. Its contribution is incremental, since it focuses on an empirical exploration of existing curriculum learning methods.

The paper tackles the cost of training neural machine translation models by exploring curriculum learning. On a German-English task, it finds that convergence time can be improved at no loss in translation quality, although the results are highly sensitive to the sample difficulty criteria, the curriculum schedule, and other hyperparameters.

Machine translation systems based on deep neural networks are expensive to train. Curriculum learning aims to address this issue by choosing the order in which samples are presented during training to help train better models faster. We adopt a probabilistic view of curriculum learning, which lets us flexibly evaluate the impact of curricula design, and perform an extensive exploration on a German-English translation task. Results show that it is possible to improve convergence time at no loss in translation quality. However, results are highly sensitive to the choice of sample difficulty criteria, curriculum schedule and other hyperparameters.
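
To make the probabilistic view concrete, the sketch below treats a curriculum as a set of per-example sampling weights that change with training progress. It is an illustration only and not the paper's implementation: the function names, the `sharpness` parameter, the exponential down-weighting, and the assumption that difficulty is a score in [0, 1] (for example, normalized sentence length) are all hypothetical choices made for this example.

```python
import math
import random


def curriculum_weights(difficulties, progress, sharpness=5.0):
    """Return one unnormalized sampling weight per example.

    Hypothetical scheme: examples whose difficulty is at or below the
    current training progress get weight 1.0; harder examples are
    down-weighted exponentially in how far they exceed it.
    Both difficulties and progress are assumed to lie in [0, 1].
    """
    weights = []
    for d in difficulties:
        gap = max(0.0, d - progress)  # how far beyond current progress
        weights.append(math.exp(-sharpness * gap))
    return weights


def sample_batch(examples, difficulties, progress, batch_size, rng):
    """Draw a batch (with replacement) according to the curriculum weights."""
    weights = curriculum_weights(difficulties, progress, sharpness=5.0)
    return rng.choices(examples, weights=weights, k=batch_size)


if __name__ == "__main__":
    # Toy corpus: difficulty is an assumed score, not a measured one.
    examples = ["short", "medium sentence", "a much longer, harder sentence"]
    difficulties = [0.0, 0.5, 1.0]
    rng = random.Random(0)
    for progress in (0.0, 0.5, 1.0):
        batch = sample_batch(examples, difficulties, progress, 4, rng)
        print(progress, batch)
```

Early in training (progress near 0) easy examples dominate the batches; at progress 1.0 every weight equals 1.0 and sampling is uniform. Changing the difficulty score or the weighting function changes the curriculum without touching the training loop, which is the kind of flexibility the probabilistic view is meant to provide.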

Code Implementations: 1 repo
