Structure Extraction in Task-Oriented Dialogues with Slot Clustering
This work addresses the challenge of automating structure extraction for task-oriented dialogues. The contribution is incremental, as it builds on existing methods for slot clustering and state tracking.
The paper tackles the expensive manual annotation required for dialogue structure extraction in task-oriented dialogues. It proposes an approach that clusters slot tokens and tracks their status to derive state transitions. The approach outperforms unsupervised baselines and, through data augmentation, improves dialogue response generation.
Extracting structure information from dialogue data can help us better understand user and system behaviors. In task-oriented dialogues, dialogue structure has often been considered as transition graphs among dialogue states. However, annotating dialogue states manually is expensive and time-consuming. In this paper, we propose a simple yet effective approach for structure extraction in task-oriented dialogues. We first detect and cluster possible slot tokens with a pre-trained model to approximate the dialogue ontology for a target domain. We then track the status of each identified token group and derive a state transition structure. Empirical results show that our approach outperforms unsupervised baseline models by a wide margin in dialogue structure extraction. In addition, we show that data augmentation based on extracted structures enriches the surface formats of training data and can achieve a significant performance boost in dialogue response generation.
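The tracking and transition steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the slot detection and clustering step is replaced by a hand-written, hypothetical `TOKEN_TO_CLUSTER` mapping, a dialogue state is assumed to be the set of slot clusters mentioned so far, and the function names and toy dialogues are invented for illustration.

```python
from collections import Counter

# Hypothetical token-to-cluster mapping, standing in for the output of the
# slot detection and clustering step (done with a pre-trained model in the
# paper; hard-coded here as an assumption).
TOKEN_TO_CLUSTER = {
    "cheap": 0, "expensive": 0,   # cluster 0: price-like tokens
    "north": 1, "centre": 1,      # cluster 1: area-like tokens
    "italian": 2, "chinese": 2,   # cluster 2: food-like tokens
}


def dialogue_states(turns, token_to_cluster):
    """Return one state per turn: the set of slot clusters mentioned so far."""
    filled = set()
    states = []
    for turn in turns:
        for token in turn.lower().split():
            cluster = token_to_cluster.get(token)
            if cluster is not None:
                filled.add(cluster)
        states.append(frozenset(filled))
    return states


def transition_counts(dialogues, token_to_cluster):
    """Count transitions between consecutive states across all dialogues."""
    counts = Counter()
    for turns in dialogues:
        previous = frozenset()  # assume every dialogue starts with no slot filled
        for state in dialogue_states(turns, token_to_cluster):
            counts[(previous, state)] += 1
            previous = state
    return counts


# Toy dialogues, invented for illustration only.
dialogues = [
    ["i want a cheap restaurant", "which area", "north please"],
    ["find me italian food", "any price range", "cheap"],
]
graph = transition_counts(dialogues, TOKEN_TO_CLUSTER)
```

Here `graph` maps each `(previous_state, next_state)` pair to how often it occurs, which is one simple way to represent a transition graph among dialogue states. A turn that mentions no new cluster yields a self-loop on the current state.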