Intent Induction from Conversations for Task-Oriented Dialogue Track at DSTC 11
This work provides a standardized benchmark for researchers and developers of task-oriented dialogue systems, making it easier to track progress and compare systems. The contribution is incremental, since it builds on existing intent induction efforts.
The paper introduces a benchmark for evaluating automatic intent induction from customer service conversations, addressing the lack of standardized evaluation in this area, and reports results from submissions by 34 teams.
With increasing demand for and adoption of virtual assistants, recent work has investigated ways to accelerate bot schema design through the automatic induction of intents or the induction of slots and dialogue states. However, a lack of dedicated benchmarks and standardized evaluation has made progress difficult to track and comparisons between systems difficult to make. This challenge track, held as part of the Eleventh Dialog Systems Technology Challenge, introduces a benchmark that aims to evaluate methods for the automatic induction of customer intents in a realistic setting of customer service interactions between human agents and customers. We propose two subtasks for progressively tackling the automatic induction of intents and corresponding evaluation methodologies. We then present three datasets suitable for evaluating the tasks and propose simple baselines. Finally, we summarize the submissions and results of the challenge track, for which we received submissions from 34 teams.