CL, LG · Jan 12, 2020

Stochastic Natural Language Generation Using Dependency Information

arXiv:2001.03897v1
AI Analysis

This work addresses natural language generation for applications such as dialogue systems and data-to-text tasks. The contribution is incremental: it builds on existing corpus-based and dependency-based methods.

The authors tackled the problem of natural language generation by proposing a stochastic corpus-based model that encodes dependency relations to produce new dependency trees and generate utterances. Their model outperformed corpus-based state-of-the-art methods on tabular datasets and achieved comparable results with neural approaches on dialogue act, E2E, and WebNLG datasets in terms of BLEU and ERR metrics, with human evaluation confirming high-quality outputs in informativeness and naturalness.

This article presents a stochastic corpus-based model for generating natural language text. Our model first encodes dependency relations from the training data through a feature set, then concatenates these features to produce a new dependency tree for a given meaning representation, and finally generates a natural language utterance from the produced dependency tree. We test our model on nine domains in tabular, dialogue act, and RDF formats. Our model outperforms the corpus-based state-of-the-art methods trained on tabular datasets and achieves results comparable to neural network-based approaches trained on the dialogue act, E2E, and WebNLG datasets on the BLEU and ERR evaluation metrics. In addition, by reporting human evaluation results, we show that our model produces high-quality utterances in terms of informativeness, naturalness, and quality.
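The abstract reports ERR without defining it. In the dialogue-act generation literature, ERR usually denotes the slot error rate: the number of missing plus redundant slot values in the generated utterance, divided by the number of slots in the meaning representation. The sketch below assumes that definition and uses plain case-insensitive substring matching; the function name `slot_error_rate` and the example meaning representation are hypothetical and are not taken from the paper.

```python
def slot_error_rate(mr_slots, utterance):
    """Slot error rate, assumed here to be (missing + redundant) / N.

    mr_slots  : dict mapping slot name -> slot value (the meaning representation)
    utterance : generated text to check against the meaning representation
    Matching is naive case-insensitive substring search (an assumption for
    illustration; real evaluations typically use delexicalised slot tokens).
    """
    if not mr_slots:
        raise ValueError("meaning representation has no slots")

    text = utterance.lower()
    values = [value.lower() for value in mr_slots.values()]

    # A slot is missing when its value never appears in the utterance.
    missing = sum(1 for value in values if value not in text)
    # Every repetition beyond the first mention counts as redundant.
    redundant = sum(max(text.count(value) - 1, 0) for value in values)

    return (missing + redundant) / len(values)


# Hypothetical meaning representation with three slots.
mr = {"name": "Aromi", "food": "Italian", "area": "riverside"}

complete = slot_error_rate(mr, "Aromi serves Italian food by the riverside.")
dropped = slot_error_rate(mr, "Aromi serves Italian food.")
```

Here `complete` is zero because every slot value is realised exactly once, while `dropped` is one third because the `area` slot is missing from the utterance.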

