CL · Aug 27, 2018

Large Margin Neural Language Model

arXiv:1808.08987v11098 citations
Originality: Highly original
AI Analysis

This paper addresses the issue that the standard training metric for neural language models can be suboptimal in tasks like speech recognition and machine translation, and offers a novel method for task-specific re-scoring.

The paper tackles the problem that perplexity (PPL) may not be the best objective for training neural language models in some tasks. It proposes a large margin criterion that enlarges the margin between good and bad sentences, resulting in up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.

We propose a large margin criterion for training neural language models. Conventionally, neural language models are trained by minimizing perplexity (PPL) on grammatical sentences. However, we demonstrate that PPL may not be the best metric to optimize in some tasks, and further propose a large margin formulation. The proposed method aims to enlarge the margin between the "good" and "bad" sentences in a task-specific sense. It is trained end-to-end and can be widely applied to tasks that involve re-scoring of generated text. Compared with minimum-PPL training, our method gains up to 1.1 WER reduction for speech recognition and 1.0 BLEU increase for machine translation.
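To make the idea of "enlarging the margin" concrete, here is a minimal sketch of a pairwise hinge loss over re-scoring candidates. It is an illustration under stated assumptions, not the paper's exact formulation: the function name `pairwise_margin_loss`, the use of a per-candidate error value (for example a word error rate) to decide which sentence is "good", and the default margin of 1.0 are all hypothetical choices that the abstract does not specify.

```python
from typing import Sequence


def pairwise_margin_loss(
    scores: Sequence[float],
    errors: Sequence[float],
    margin: float = 1.0,
) -> float:
    """Average hinge loss over candidate pairs (hypothetical helper).

    scores: language-model scores (e.g. log-probabilities), higher is better.
    errors: task-specific error per candidate (e.g. WER), lower is better.
    margin: assumed minimum score gap between a better and a worse candidate.
    """
    if len(scores) != len(errors):
        raise ValueError("scores and errors must have the same length")

    total = 0.0
    pairs = 0
    for i in range(len(scores)):
        for j in range(len(scores)):
            # Only penalize pairs where candidate i is strictly better than j.
            if errors[i] < errors[j]:
                # Zero loss once the better candidate leads by at least `margin`.
                total += max(0.0, margin - (scores[i] - scores[j]))
                pairs += 1
    return total / pairs if pairs else 0.0


# Three hypothetical candidates for one utterance.
lm_scores = [-10.0, -12.0, -9.5]   # model scores
task_errors = [0.0, 0.2, 0.5]      # lower error means a "better" sentence
loss = pairwise_margin_loss(lm_scores, task_errors)
```

In this toy example the third candidate has the highest score but the largest error, so the pairs that involve it contribute a positive loss. Minimizing such a loss would push the model to score low-error sentences above high-error ones, which is the sense in which a margin objective differs from minimizing PPL on reference text alone.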

