Summary Grounded Conversation Generation
This work addresses the problem of generating high-quality conversation data efficiently for researchers and developers. The contribution is incremental in that it builds on existing pre-trained language models.
The paper tackles the challenge of generating entire conversations from summaries using pre-trained language models, and shows that augmenting a conversation summarization dataset with the generated conversations improves summarization accuracy.
Many conversation datasets have been constructed in recent years using crowdsourcing. However, the data collection process can be time-consuming, and ensuring data quality presents many challenges. Since language generation has improved immensely in recent years with the advancement of pre-trained language models, we investigate how such models can be utilized to generate entire conversations, given only a summary of a conversation as the input. We explore three approaches to generate summary grounded conversations, and evaluate the generated conversations using automatic measures and human judgements. We also show that the accuracy of conversation summarization can be improved by augmenting a conversation summarization dataset with generated conversations.
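The augmentation step mentioned at the end of the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the record layout (`summary`, `conversation`, `source` keys), the function `augment_with_generated`, and the placeholder `toy_generator` are all hypothetical names introduced here. In practice the generator would be one of the summary grounded generation models, which this sketch does not attempt to reproduce.

```python
from typing import Callable, Dict, List

# Hypothetical record layout: each example pairs a summary with a conversation.
Example = Dict[str, str]


def augment_with_generated(
    dataset: List[Example],
    generate_conversation: Callable[[str], str],
) -> List[Example]:
    """Return the original examples plus one generated conversation per summary."""
    augmented = list(dataset)  # keep the human-written pairs unchanged
    for example in dataset:
        summary = example["summary"]
        # The generated conversation is paired with the summary it was grounded on.
        augmented.append(
            {
                "summary": summary,
                "conversation": generate_conversation(summary),
                "source": "generated",
            }
        )
    return augmented


def toy_generator(summary: str) -> str:
    # Placeholder standing in for a fine-tuned language model (assumption).
    return f"A: Can you recap?\nB: Sure. {summary}"


original = [
    {
        "summary": "Amy and Bob agree to meet at noon.",
        "conversation": "Amy: Lunch at noon?\nBob: Works for me.",
    }
]
augmented = augment_with_generated(original, toy_generator)
```

Under these assumptions, `augmented` holds the original pair followed by a generated pair that shares the same summary. A summarization model could then be trained on the combined list, and the `source` field allows the generated examples to be filtered or reweighted later.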