CL, LG · Oct 12, 2020

Improving Text Generation with Student-Forcing Optimal Transport

arXiv:2010.05994v1997 citations
Originality: Incremental advance
AI Analysis

The paper addresses exposure bias, a key challenge in text generation for NLP applications, but the advance is incremental because it builds on existing optimal transport methods.

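The paper tackles exposure bias in neural language models by using optimal transport to match sequences generated in the training mode (conditioned on ground-truth tokens) with sequences generated in the testing mode (conditioned on the model's own previous tokens). It reports improved performance on machine translation, text summarization, and text generation tasks.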

Neural language models are often trained with maximum likelihood estimation (MLE), where the next word is generated conditioned on the ground-truth word tokens. During testing, however, the model is instead conditioned on previously generated tokens, resulting in what is termed exposure bias. To reduce this gap between training and testing, we propose using optimal transport (OT) to match the sequences generated in these two modes. An extension is further proposed to improve the OT learning, based on the structural and contextual information of the text sequences. The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
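To make the matching idea concrete, here is a minimal sketch of an optimal transport cost between two token-embedding sequences, one standing in for the sequence produced under ground-truth conditioning and one for the sequence produced under self-conditioning. Everything in it is an assumption for illustration rather than the paper's implementation: the entropy-regularized Sinkhorn solver, the cosine cost, the uniform token weights, the random embeddings, and the function names `cosine_cost` and `sinkhorn_ot`.

```python
import numpy as np


def cosine_cost(x, y, eps=1e-8):
    """Pairwise cosine distance between rows of x (n, d) and y (m, d)."""
    xn = x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)
    yn = y / (np.linalg.norm(y, axis=1, keepdims=True) + eps)
    return 1.0 - xn @ yn.T


def sinkhorn_ot(cost, reg=0.1, n_iters=200):
    """Entropy-regularized OT between uniform distributions over tokens.

    Assumed solver (Sinkhorn iterations), not necessarily the one used
    in the paper. Returns the transport plan and the transport cost.
    """
    n, m = cost.shape
    a = np.full(n, 1.0 / n)      # uniform mass on the first sequence
    b = np.full(m, 1.0 / m)      # uniform mass on the second sequence
    K = np.exp(-cost / reg)      # Gibbs kernel
    u = np.ones(n)
    v = np.ones(m)
    for _ in range(n_iters):
        u = a / (K @ v)          # rescale rows toward marginal a
        v = b / (K.T @ u)        # rescale columns toward marginal b
    plan = u[:, None] * K * v[None, :]
    return plan, float(np.sum(plan * cost))


# Hypothetical embeddings: 5 tokens from the ground-truth-conditioned pass,
# and a slightly perturbed copy standing in for the self-conditioned pass.
rng = np.random.default_rng(0)
teacher_seq = rng.normal(size=(5, 16))
student_seq = teacher_seq + 0.1 * rng.normal(size=(5, 16))
unrelated_seq = rng.normal(size=(7, 16))

_, close_cost = sinkhorn_ot(cosine_cost(teacher_seq, student_seq))
_, far_cost = sinkhorn_ot(cosine_cost(teacher_seq, unrelated_seq))
print(f"OT cost, similar sequences:   {close_cost:.4f}")
print(f"OT cost, unrelated sequences: {far_cost:.4f}")
```

In a training setup along these lines, a cost of this kind would be computed on differentiable embeddings and added to the MLE loss as a regularizer, so that the two decoding modes are pulled toward each other. The sequences may differ in length, which is why the plan is a rectangular matrix rather than a one-to-one alignment.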

