CL · Feb 2, 2021

MultiTalk: A Highly-Branching Dialog Testbed for Diverse Conversations

arXiv:2102.01263v1 · 8 citations
AI Analysis

This work addresses the need for diverse dialog generation. Its contribution appears incremental, since it builds on existing datasets and methods.

The authors tackle conversational dialog in which a given history admits many possible responses. They introduce the MultiTalk Dataset, a corpus of over 320,000 sentences that combines a branching factor of 10 with 6 conversation turns, and propose a scoring algorithm based on bipartite graph matching to evaluate a diverse set of generations against a set of references. The work culminates in a theory of mind task.

We study conversational dialog in which there are many possible responses to a given history. We present the MultiTalk Dataset, a corpus of over 320,000 sentences of written conversational dialog that balances a high branching factor (10) with several conversation turns (6) through selective branch continuation. We make multiple contributions to study dialog generation in the highly branching setting. In order to evaluate a diverse set of generations, we propose a simple scoring algorithm, based on bipartite graph matching, to optimally incorporate a set of diverse references. We study multiple language generation tasks at different levels of predictive conversation depth, using textual attributes induced automatically from pretrained classifiers. Our culminating task is a challenging theory of mind problem, a controllable generation task which requires reasoning about the expected reaction of the listener.
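To make the idea of scoring a set of generations against a set of diverse references more concrete, here is a minimal sketch of a matching-based score. It is not the authors' implementation: the token-level Jaccard similarity, the normalization by the larger set size, and the names `jaccard` and `matching_score` are all assumptions chosen for illustration. The sketch finds the one-to-one assignment between generations and references that maximizes total similarity, using a bitmask dynamic program that is practical for small sets such as a branching factor of 10.

```python
from functools import lru_cache


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (an assumed stand-in for any pairwise metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def matching_score(generations, references, sim=jaccard):
    """Average similarity under the best one-to-one generation/reference matching.

    Assumption: items left unmatched (when the sets differ in size) count as 0,
    so the total is divided by the size of the larger set.
    """
    if not generations or not references:
        return 0.0

    # weights[i][j] = similarity of generation i to reference j
    weights = [[sim(g, r) for r in references] for g in generations]

    # Transpose if needed so rows are the smaller side; every row then gets a match.
    if len(generations) > len(references):
        weights = [list(col) for col in zip(*weights)]
    n_rows, n_cols = len(weights), len(weights[0])

    @lru_cache(maxsize=None)
    def best(row: int, used: int) -> float:
        """Best total weight for rows[row:], given a bitmask of columns already taken."""
        if row == n_rows:
            return 0.0
        return max(
            weights[row][c] + best(row + 1, used | (1 << c))
            for c in range(n_cols)
            if not used & (1 << c)
        )

    return best(0, 0) / max(len(generations), len(references))


if __name__ == "__main__":
    gens = ["i love dogs", "the weather is nice today"]
    refs = ["the weather is nice", "i really love dogs", "what about lunch"]
    print(matching_score(gens, refs))
```

Because each reference can be matched to at most one generation, a system that repeats the same response several times can only collect credit for it once, which is what makes a matching-based score sensitive to diversity. For larger sets, a polynomial-time assignment solver would be the more usual choice than this exhaustive dynamic program.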
