CL | AI | Oct 10, 2018

End-to-End Content and Plan Selection for Data-to-Text Generation

arXiv:1810.04700v11121 citations
Originality: Incremental advance
AI Analysis

This work addresses data-to-text generation for applications like report automation, but it is incremental as it builds on existing models with minor enhancements.

The paper tackles the challenge of generating natural language from structured data whose form varies between examples. It extends sequence-to-sequence models with techniques such as copy attention and coverage decoding and proposes a diverse-ensembling training method, which improves text quality across five automated metrics and in human evaluation.

Learning to generate fluent natural language from structured data with neural networks has become a common approach for NLG. This problem can be challenging when the form of the structured data varies between examples. This paper presents a survey of several extensions to sequence-to-sequence models to account for the latent content selection process, particularly variants of copy attention and coverage decoding. We further propose a training method based on diverse ensembling to encourage models to learn distinct sentence templates during training. An empirical evaluation of these techniques shows an increase in the quality of generated text across five automated metrics, as well as human evaluation.

Code Implementations: 1 repo
