CookDial: A dataset for task-oriented dialogs grounded in procedural documents
This work addresses the problem of building task-oriented dialog systems that understand procedural documents, a problem of interest to researchers in natural language processing. The contribution is incremental, as it consists of a new dataset and baseline models.
The authors introduce CookDial, a dataset of 260 human-to-human task-oriented dialogs grounded in recipes, to advance research on dialog systems with procedural knowledge understanding. They also develop a neural baseline model for each of three subtasks and evaluate these models on the dataset.
This work presents a new dialog dataset, CookDial, that facilitates research on task-oriented dialog systems with procedural knowledge understanding. The corpus contains 260 human-to-human task-oriented dialogs in which an agent, given a recipe document, guides the user to cook a dish. Dialogs in CookDial exhibit two unique features: (i) procedural alignment between the dialog flow and supporting document; (ii) complex agent decision-making that involves segmenting long sentences, paraphrasing hard instructions and resolving coreference in the dialog context. In addition, we identify three challenging (sub)tasks in the assumed task-oriented dialog system: (1) User Question Understanding, (2) Agent Action Frame Prediction, and (3) Agent Response Generation. For each of these tasks, we develop a neural baseline model, which we evaluate on the CookDial dataset. We publicly release the CookDial dataset, comprising rich annotations of both dialogs and recipe documents, to stimulate further research on domain-specific document-grounded dialog systems.
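To make the notion of procedural alignment concrete, the following is a minimal sketch of how a recipe-grounded dialog could be represented and checked against the order of the recipe steps. It is an illustration only: the class and field names (`Turn`, `grounded_step`), the helper functions, and the toy dialog are all assumptions made for this example and do not reflect the released CookDial annotation schema.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Turn:
    """One dialog turn (hypothetical structure, not the CookDial schema)."""
    speaker: str                         # "user" or "agent"
    utterance: str
    grounded_step: Optional[int] = None  # index of the recipe step the turn relies on


def agent_step_sequence(turns: List[Turn]) -> List[int]:
    """Collect the recipe steps referenced by agent turns, in dialog order."""
    return [
        t.grounded_step
        for t in turns
        if t.speaker == "agent" and t.grounded_step is not None
    ]


def follows_recipe_order(turns: List[Turn]) -> bool:
    """True if the agent never moves back to an earlier recipe step."""
    steps = agent_step_sequence(turns)
    return all(a <= b for a, b in zip(steps, steps[1:]))


# Toy recipe and dialog, invented purely for illustration.
recipe = ["Chop the onion into small dice.", "Fry the onion in butter."]
dialog = [
    Turn("user", "What do I do first?"),
    Turn("agent", "Start by chopping the onion.", grounded_step=0),
    Turn("user", "How finely?"),
    Turn("agent", "Into small dice.", grounded_step=0),
    Turn("user", "Done."),
    Turn("agent", "Now fry it in butter.", grounded_step=1),
]

print(agent_step_sequence(dialog))
print(follows_recipe_order(dialog))
```

In this sketch a single recipe sentence is delivered over two agent turns, which loosely mirrors the sentence segmentation behavior described above. A real dialog may legitimately revisit earlier steps, so the monotonicity check is a simplification rather than a property claimed for the dataset.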