The Dialogue Dodecathlon: Open-Domain Knowledge and Image Grounded Conversational Agents
This work addresses the challenge of building a unified conversational agent for open-domain human interaction. Its contribution is incremental, since it builds on existing multi-tasking and pre-training methods.
The paper introduces dodecaDialogue, a set of 12 tasks that measure a conversational agent's abilities in open-domain dialogue. It shows that multi-tasking improves performance over a BERT pre-trained baseline and reports state-of-the-art results on many of the tasks.
We introduce dodecaDialogue: a set of 12 tasks that measures whether a conversational agent can communicate engagingly with personality and empathy, ask questions, answer questions by utilizing knowledge resources, discuss topics and situations, and perceive and converse about images. By multi-tasking on such a broad, large-scale set of data, we hope both to move towards and to measure progress in producing a single unified agent that can perceive, reason and converse with humans in an open-domain setting. We show that such multi-tasking improves over a BERT pre-trained baseline, largely due to multi-tasking with very large dialogue datasets in a similar domain, and that multi-tasking in general provides gains on both text and image-based tasks, across several metrics, in both the fine-tuning and task-transfer settings. We obtain state-of-the-art results on many of the tasks, providing a strong baseline for this challenge.