CL · Jun 28, 2017

The E2E Dataset: New Challenges For End-to-End Generation

arXiv:1706.09254v2 · 1241 citations
Originality: Synthesis-oriented
AI Analysis

The dataset provides a more challenging benchmark for natural language generation research, though the contribution is incremental in that it builds on existing datasets.

The paper introduces the E2E dataset, a new and larger dataset for end-to-end natural language generation in the restaurant domain. The dataset poses challenges such as greater lexical richness and the need for content selection, and the paper establishes a baseline that illustrates these difficulties.

This paper describes the E2E data, a new dataset for training end-to-end, data-driven natural language generation systems in the restaurant domain, which is ten times bigger than existing, frequently used datasets in this area. The E2E dataset poses new challenges: (1) its human reference texts show more lexical richness and syntactic variation, including discourse phenomena; (2) generating from this set requires content selection. As such, learning from this dataset promises more natural, varied and less template-like system utterances. We also establish a baseline on this dataset, which illustrates some of the difficulties associated with this data.
