CREDIT: Coarse-to-Fine Sequence Generation for Dialogue State Tracking
This work addresses the problem of accurately tracking dialogue states in conversational AI systems, offering a generative method that does not rely on pre-defined ontologies, though it appears incremental as it builds on existing generative approaches.
The paper tackles dialogue state tracking by reformulating it as a sequence generation problem over a structured state representation, and proposes the CREDIT approach, which achieves encouraging joint goal accuracy on the MultiWOZ 2.0 and 2.1 datasets.
In dialogue systems, a dialogue state tracker aims to accurately find a compact representation of the current dialogue status, based on the entire dialogue history. While previous approaches often define dialogue states as a combination of separate triples ({\em domain-slot-value}), in this paper we employ a structured state representation and cast dialogue state tracking as a sequence generation problem. Based on this new formulation, we propose a {\bf C}oa{\bf R}s{\bf E}-to-fine {\bf DI}alogue state {\bf T}racking ({\bf CREDIT}) approach. Because the structured state representation is a marked language sequence, the model, once pre-trained with supervised learning, can be further fine-tuned by optimizing natural language metrics with the policy gradient method. Like all generative state tracking methods, CREDIT does not rely on a pre-defined dialogue ontology enumerating all possible slot values. Experiments demonstrate that our tracker achieves encouraging joint goal accuracy on the five domains of the MultiWOZ 2.0 and MultiWOZ 2.1 datasets.
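To make the reformulation concrete, the following is a minimal sketch of how a set of domain-slot-value triples might be flattened into a single marked sequence, parsed back, and scored with joint goal accuracy. The marker tokens (`<domain>`, `<slot>`, `<value>`), the function names, and the domain-first ordering are assumptions chosen for illustration; the abstract does not specify CREDIT's actual markup or decoding order. Joint goal accuracy is computed in its usual sense: a turn counts as correct only when the entire predicted state matches the gold state.

```python
from collections import defaultdict

# Hypothetical marker tokens; the paper's real markup is not given in the abstract.
MARKERS = {"<domain>", "<slot>", "<value>"}


def linearize(state):
    """Flatten {(domain, slot): value} into one marked token sequence.

    Slots are grouped under their domain, so the sequence reads from the
    coarse level (domain) down to the fine level (slot, then value).
    """
    grouped = defaultdict(list)
    for (domain, slot), value in sorted(state.items()):
        grouped[domain].append((slot, value))
    tokens = []
    for domain, pairs in grouped.items():
        tokens += ["<domain>", domain]
        for slot, value in pairs:
            tokens += ["<slot>", slot, "<value>", value]
    return " ".join(tokens)


def parse(sequence):
    """Recover {(domain, slot): value} from a marked sequence."""
    state, domain, slot = {}, None, None
    field, buf = None, []

    def flush():
        # Commit the text collected since the last marker.
        nonlocal domain, slot
        text = " ".join(buf)
        if field == "<domain>":
            domain = text
        elif field == "<slot>":
            slot = text
        elif field == "<value>" and domain is not None and slot is not None:
            state[(domain, slot)] = text

    # "<end>" is a sentinel that forces the final value to be committed.
    for tok in sequence.split() + ["<end>"]:
        if tok in MARKERS or tok == "<end>":
            flush()
            field, buf = tok, []
        else:
            buf.append(tok)
    return state


def joint_goal_accuracy(predicted, gold):
    """Fraction of turns whose predicted state equals the gold state exactly."""
    if not gold:
        return 0.0
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


gold_state = {
    ("hotel", "area"): "north",
    ("hotel", "stars"): "4",
    ("restaurant", "name"): "el shaddai",
}
sequence = linearize(gold_state)
recovered = parse(sequence)
```

Because the state is just a token sequence, any sequence-level metric can in principle be computed between a generated sequence and the reference, which is what makes metric-driven fine-tuning with policy gradient applicable. Note also that values such as "el shaddai" are generated as free text rather than selected from an enumerated list, which is why no ontology of slot values is required.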