CL · AI · Mar 16, 2022

Multi-Stage Prompting for Knowledgeable Dialogue Generation

Zihan Liu, Mostofa Patwary, Ryan Prenger, Shrimai Prabhumoye, Wei Ping, Mohammad Shoeybi, Bryan Catanzaro

arXiv:2203.08745v1


AI Analysis

This paper addresses generalization and efficiency in knowledge-grounded dialogue systems, offering a method that removes the need to maintain separate finetuned checkpoints.

The paper tackles the limitations of knowledge-grounded dialogue systems with a multi-stage prompting approach in which a single pretrained language model first generates knowledge and then generates the response. The knowledge generator outperforms the state-of-the-art retrieval-based model by 5.8% on combined knowledge relevance and correctness, and the full approach improves response knowledgeability and engagement over a finetuning-based dialogue model by up to 10% and 5%, respectively.

Existing knowledge-grounded dialogue systems typically use finetuned versions of a pretrained language model (LM) and large-scale knowledge bases. These models typically fail to generalize on topics outside of the knowledge base, and require maintaining separate potentially large checkpoints each time finetuning is needed. In this paper, we aim to address these limitations by leveraging the inherent knowledge stored in the pretrained LM as well as its powerful generation ability. We propose a multi-stage prompting approach to generate knowledgeable responses from a single pretrained LM. We first prompt the LM to generate knowledge based on the dialogue context. Then, we further prompt it to generate responses based on the dialogue context and the previously generated knowledge. Results show that our knowledge generator outperforms the state-of-the-art retrieval-based model by 5.8% when combining knowledge relevance and correctness. In addition, our multi-stage prompting outperforms the finetuning-based dialogue model in terms of response knowledgeability and engagement by up to 10% and 5%, respectively. Furthermore, we scale our model up to 530 billion parameters and show that larger LMs improve the generation correctness score by up to 10%, and response relevance, knowledgeability and engagement by up to 10%. Our code is available at: https://github.com/NVIDIA/Megatron-LM.
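The two prompting stages described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the prompt templates, the `LM` callable interface, the few-shot example format, and the `toy_lm` stand-in are all assumptions made for this sketch, and the paper's own prompt construction may differ.

```python
from typing import Callable, List, Tuple

# Hypothetical interface: any function that maps a prompt to a completion.
LM = Callable[[str], str]


def knowledge_prompt(shots: List[Tuple[str, str]], context: str) -> str:
    """Stage 1 prompt: few-shot (context, knowledge) pairs, then the new context."""
    blocks = [f"Context: {c}\nKnowledge: {k}" for c, k in shots]
    blocks.append(f"Context: {context}\nKnowledge:")
    return "\n\n".join(blocks)


def response_prompt(
    shots: List[Tuple[str, str, str]], context: str, knowledge: str
) -> str:
    """Stage 2 prompt: few-shot (context, knowledge, response) triples,
    then the new context together with the knowledge from stage 1."""
    blocks = [f"Context: {c}\nKnowledge: {k}\nResponse: {r}" for c, k, r in shots]
    blocks.append(f"Context: {context}\nKnowledge: {knowledge}\nResponse:")
    return "\n\n".join(blocks)


def first_line(text: str) -> str:
    """Keep only the first non-empty line of a completion (assumed stop rule)."""
    stripped = text.strip()
    return stripped.splitlines()[0].strip() if stripped else ""


def multi_stage_respond(
    lm: LM,
    context: str,
    k_shots: List[Tuple[str, str]],
    r_shots: List[Tuple[str, str, str]],
) -> Tuple[str, str]:
    """Prompt the same LM twice: once for knowledge, once for the response."""
    knowledge = first_line(lm(knowledge_prompt(k_shots, context)))
    response = first_line(lm(response_prompt(r_shots, context, knowledge)))
    return knowledge, response


def toy_lm(prompt: str) -> str:
    """Stand-in for a real pretrained LM; returns canned text per stage."""
    if prompt.endswith("Response:"):
        return " Yes, it is in Paris.\nContext: ..."
    return " The Eiffel Tower is in Paris.\n"


if __name__ == "__main__":
    k_shots = [("Tell me about Mount Everest.", "Mount Everest is in the Himalayas.")]
    r_shots = [(
        "Tell me about Mount Everest.",
        "Mount Everest is in the Himalayas.",
        "It sits in the Himalayas.",
    )]
    print(multi_stage_respond(toy_lm, "Where is the Eiffel Tower?", k_shots, r_shots))
```

Because both stages call the same `lm`, no separate finetuned checkpoint is involved; swapping `toy_lm` for a call into a real pretrained model is the only change needed to try the idea.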
