CL · AI · LG · Nov 16, 2022

Reflect, Not Reflex: Inference-Based Common Ground Improves Dialogue Response Quality

AI2
arXiv:2211.09267v1301 citations · h-index: 42
Originality: Incremental advance
AI Analysis

This work addresses poor response quality in dialogue systems, but it is an incremental advance: it builds on existing response generation methods by adding common ground annotations.

The paper tackles generic and dull responses in dialogue generation by introducing Reflect, a dataset with explicit common ground annotations. It reports that fewer than half of the responses in existing dialogue data are rated high quality (sensible, specific, and interesting), that models trained on this data score even lower, and that most Reflect responses are judged high quality. It also reports that prompting GPT-3 to "think" about common ground yields 30% more high-quality responses.

Human communication relies on common ground (CG), the mutual knowledge and beliefs shared by participants, to produce coherent and interesting conversations. In this paper, we demonstrate that current response generation (RG) models produce generic and dull responses in dialogues because they act reflexively, failing to explicitly model CG, both due to the lack of CG in training data and the standard RG training procedure. We introduce Reflect, a dataset that annotates dialogues with explicit CG (materialized as inferences approximating shared knowledge and beliefs) and solicits 9k diverse human-generated responses each following one common ground. Using Reflect, we showcase the limitations of current dialogue data and RG models: less than half of the responses in current data are rated as high quality (sensible, specific, and interesting) and models trained using this data have even lower quality, while most Reflect responses are judged high quality. Next, we analyze whether CG can help models produce better-quality responses by using Reflect CG to guide RG models. Surprisingly, we find that simply prompting GPT3 to "think" about CG generates 30% more quality responses, showing promising benefits to integrating CG into the RG process.
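The "think about CG, then respond" idea from the abstract can be sketched as a two-step prompt. This is a minimal illustration only: the function names, the "Shared understanding so far:" cue, and the speaker labels are assumptions made here, not the prompt format used in the paper, and no model API is called.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance)


def format_history(turns: List[Turn]) -> str:
    """Render dialogue turns as 'speaker: utterance' lines."""
    return "\n".join(f"{speaker}: {utterance}" for speaker, utterance in turns)


def build_inference_prompt(turns: List[Turn]) -> str:
    """Step 1 (hypothetical format): ask the model to state the common ground."""
    return f"{format_history(turns)}\nShared understanding so far:"


def build_response_prompt(turns: List[Turn], inference: str, next_speaker: str) -> str:
    """Step 2 (hypothetical format): condition the reply on the stated common ground."""
    return (
        f"{format_history(turns)}\n"
        f"Shared understanding so far: {inference.strip()}\n"
        f"{next_speaker}:"
    )


if __name__ == "__main__":
    turns = [
        ("A", "I finally ran my first marathon."),
        ("B", "Wow, how do you feel?"),
    ]
    print(build_inference_prompt(turns))
    # In practice the inference would come from the model's completion of step 1.
    print(build_response_prompt(turns, "A trained hard and is probably exhausted.", "A"))
```

Each prompt string would be sent to a language model as a completion request; the inference returned for step 1 is pasted into step 2 so the reply is grounded in it rather than produced reflexively.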

