CL | Jun 1, 2023

Diverse and Faithful Knowledge-Grounded Dialogue Generation via Sequential Posterior Inference

Yan Xu, Deqian Kong, Dehong Xu, Ziwei Ji, Bo Pang, Pascale Fung, Ying Nian Wu

arXiv:2306.01153v22.99 citations | h-index: 18 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of building trustworthy dialogue systems by improving the faithfulness and diversity of responses, though its advance over existing joint optimization methods is incremental.

The paper tackles the problem of generating diverse and faithful dialogue responses grounded in factual knowledge. It proposes Sequential Posterior Inference (SPI), an end-to-end framework that jointly optimizes knowledge selection and response generation, and reports that SPI outperforms previous baselines on the Wizard of Wikipedia and Holl-E datasets in both automatic and human evaluations.

The capability to generate responses with diversity and faithfulness using factual knowledge is paramount for creating a human-like, trustworthy dialogue system. Common strategies either adopt a two-step paradigm, which optimizes knowledge selection and response generation separately and may overlook the inherent correlation between these two tasks, or leverage a conditional variational method to jointly optimize knowledge selection and response generation by employing an inference network. In this paper, we present an end-to-end learning framework, termed Sequential Posterior Inference (SPI), capable of selecting knowledge and generating dialogues by approximately sampling from the posterior distribution. Unlike other methods, SPI does not require an inference network or assume a simple geometry of the posterior distribution. The straightforward and intuitive inference procedure of SPI directly queries the response generation model, allowing for accurate knowledge selection and generation of faithful responses. In addition to the modeling contributions, our experimental results on two common dialogue datasets (Wizard of Wikipedia and Holl-E) demonstrate that SPI outperforms previous strong baselines according to both automatic and human evaluation metrics.
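To make the idea of selecting knowledge by "directly querying the response generation model" concrete, here is a toy sketch, not the authors' implementation. It assumes a small set of discrete knowledge candidates and applies Bayes' rule, p(k | context, response) ∝ p(k | context) · p(response | context, k), where the log-likelihoods would come from the generator. The function names, the uniform prior, and the numeric scores are all hypothetical placeholders; the abstract does not specify how SPI performs its approximate sampling.

```python
import math
import random


def knowledge_posterior(prior_logits, response_loglik):
    """Posterior over knowledge candidates via Bayes' rule.

    prior_logits[i]    : unnormalized log p(k_i | context)   (hypothetical)
    response_loglik[i] : log p(response | context, k_i), as a stand-in
                         for a score returned by the response generator
    """
    scores = [a + b for a, b in zip(prior_logits, response_loglik)]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def sample_knowledge(posterior, rng):
    """Draw one candidate index in proportion to its posterior mass."""
    return rng.choices(range(len(posterior)), weights=posterior, k=1)[0]


# Made-up scores for three candidate knowledge sentences (illustration only).
prior_logits = [0.0, 0.0, 0.0]         # uniform prior over candidates
response_loglik = [-12.0, -3.0, -9.0]  # candidate 1 explains the response best

posterior = knowledge_posterior(prior_logits, response_loglik)
rng = random.Random(0)
selected = sample_knowledge(posterior, rng)
print(posterior, selected)
```

In this sketch no separate inference network is trained: the posterior weights are obtained by scoring the observed response under each candidate, which is the contrast with conditional variational approaches that the abstract draws.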
