CL · Jun 10, 2021

AUGNLG: Few-shot Natural Language Generation using Self-trained Data Augmentation

Xinnuo Xu, Guoyin Wang, Young-Bum Kim, Sungjin Lee

arXiv:2106.05589v131.6714 citations · Has Code

Originality: Incremental advance

AI Analysis

The work addresses scalability issues in large-scale conversational systems with many intents and slots. The advance is incremental because it builds on existing few-shot neural NLG approaches.

The paper tackles the problem of scaling natural language generation for task-oriented dialogue systems by proposing AUGNLG, a data augmentation method that automatically creates MR-to-Text data from open-domain texts. The resulting system mostly outperforms prior state-of-the-art methods on FewShotWOZ in both BLEU and Slot Error Rate.

Natural Language Generation (NLG) is a key component of a task-oriented dialogue system: it converts a structured meaning representation (MR) into natural language. For large-scale conversational systems, which commonly have hundreds of intents and thousands of slots, neither template-based nor model-based approaches scale. Recently, neural NLG models have started leveraging transfer learning and have shown promising results in few-shot settings. This paper proposes AUGNLG, a novel data augmentation approach that combines a self-trained neural retrieval model with a few-shot learned NLU model to automatically create MR-to-Text data from open-domain texts. The proposed system mostly outperforms the state-of-the-art methods on the FewShotWOZ data in both BLEU and Slot Error Rate. We further confirm improved results on the FewShotSGD data and provide comprehensive analysis of the key components of our system. Our code and data are available at https://github.com/XinnuoXu/AugNLG.
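The retrieve-then-label idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the neural retrieval model is replaced here by a simple token-overlap (Jaccard) filter, the few-shot NLU model by a single-word lexicon lookup, and every name (`retrieve`, `label_mr`, `augment`, `slot_lexicon`, `min_overlap`) and all example sentences are hypothetical, chosen only to show how open-domain text might be turned into MR-to-Text pairs.

```python
import re


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def retrieve(seed_utterances, open_domain_texts, min_overlap):
    """Toy stand-in for a retrieval model (assumption, not the paper's model).

    Keeps an open-domain sentence when its Jaccard token overlap with at
    least one in-domain seed utterance reaches min_overlap.
    """
    seeds = [set(tokenize(s)) for s in seed_utterances]
    kept = []
    for text in open_domain_texts:
        tokens = set(tokenize(text))
        if not tokens:
            continue
        best = max(len(tokens & s) / len(tokens | s) for s in seeds)
        if best >= min_overlap:
            kept.append(text)
    return kept


def label_mr(text, slot_lexicon):
    """Toy stand-in for an NLU tagger: single-word lexicon lookup per slot."""
    tokens = set(tokenize(text))
    mr = {}
    for slot, values in slot_lexicon.items():
        for value in values:
            if value in tokens:
                mr[slot] = value
    return mr


def augment(seed_utterances, open_domain_texts, slot_lexicon, min_overlap=0.3):
    """Return (MR, text) pairs for retrieved sentences with at least one slot."""
    pairs = []
    for text in retrieve(seed_utterances, open_domain_texts, min_overlap):
        mr = label_mr(text, slot_lexicon)
        if mr:
            pairs.append((mr, text))
    return pairs


# Hypothetical data, for illustration only.
seeds = ["there is a cheap italian restaurant in the centre"]
open_domain = [
    "i found a cheap italian place in the centre of town",
    "my cat sleeps all day",
    "the restaurant in the north is expensive",
]
slot_lexicon = {
    "price": ["cheap", "expensive"],
    "food": ["italian"],
    "area": ["centre", "north"],
}

pairs = augment(seeds, open_domain, slot_lexicon)
for mr, text in pairs:
    print(mr, "->", text)
```

In this sketch the unrelated sentence is filtered out at the retrieval step, and the two remaining sentences are paired with the slot values found in them. A real system would replace both stand-ins with learned models.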
