CL, Jun 28, 2017

Data-driven Natural Language Generation: Paving the Road to Success

Jekaterina Novikova, Ondřej Dušek, Verena Rieser

arXiv:1706.09433v1, 1.64 citations

Originality: Incremental advance

AI Analysis

This paper addresses two bottlenecks in commercializing statistical machine learning for natural language generation. The advance is incremental, since it builds on known evaluation and data issues.

The paper tackles the lack of reliable automatic evaluation metrics and the scarcity of high-quality in-domain corpora for natural language generation. It analyses current evaluation metrics to motivate the need for a new, more reliable one, and presents a framework for developing and evaluating a corpus for NLG training.

We argue that there are currently two major bottlenecks to the commercial use of statistical machine learning approaches for natural language generation (NLG): (a) The lack of reliable automatic evaluation metrics for NLG, and (b) The scarcity of high quality in-domain corpora. We address the first problem by thoroughly analysing current evaluation metrics and motivating the need for a new, more reliable metric. The second problem is addressed by presenting a novel framework for developing and evaluating a high quality corpus for NLG training.
