CL Apr 23, 2024

Simulating Task-Oriented Dialogues with State Transition Graphs and Large Language Models

Chris Samarinas, Pracha Promthaw, Atharva Nijasure, Hansi Zeng, Julian Killingback, Hamed Zamani

arXiv:2404.14772v14.26 citations h-index: 6 Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of quickly developing domain-specific TOD systems without crowdsourcing, offering a practical solution for researchers and developers in dialogue systems.

The paper tackles the problem of developing end-to-end Task-Oriented Dialogue (TOD) systems that handle tasks such as intent classification and slot filling. It introduces SynTOD, a synthetic data generation approach that uses state transition graphs and large language models to create structured conversations without crowdsourcing or real-world data. In the authors' experiments, graph-guided response simulation yields significant improvements in intent classification, slot filling, and response relevance over naive single-prompt simulated conversations.

This paper explores SynTOD, a new synthetic data generation approach for developing end-to-end Task-Oriented Dialogue (TOD) Systems capable of handling complex tasks such as intent classification, slot filling, conversational question-answering, and retrieval-augmented response generation, without relying on crowdsourcing or real-world data. SynTOD utilizes a state transition graph to define the desired behavior of a TOD system and generates diverse, structured conversations through random walks and response simulation using large language models (LLMs). In our experiments, using graph-guided response simulations leads to significant improvements in intent classification, slot filling and response relevance compared to naive single-prompt simulated conversations. We also investigate the end-to-end TOD effectiveness of different base and instruction-tuned LLMs, with and without the constructed synthetic conversations. Finally, we explore how various LLMs can evaluate responses in a TOD system and how well they are correlated with human judgments. Our findings pave the path towards quick development and evaluation of domain-specific TOD systems. We release our datasets, models, and code for research purposes.
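The graph-guided generation described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the graph, its state names, the transition probabilities, and the `simulate_turn` helper are all hypothetical, and the LLM call is replaced by a placeholder so that the example runs on its own. It only shows the general idea of sampling a dialogue skeleton by a random walk over a state transition graph and then filling in each turn.

```python
import random

# Hypothetical state transition graph (not taken from the paper).
# Each state is an intent; each edge carries an assumed transition probability.
GRAPH = {
    "start": [("search", 1.0)],
    "search": [("select_result", 0.7), ("search", 0.3)],
    "select_result": [("ask_question", 0.5), ("next_step", 0.5)],
    "ask_question": [("next_step", 0.8), ("ask_question", 0.2)],
    "next_step": [("next_step", 0.5), ("ask_question", 0.2), ("end", 0.3)],
    "end": [],
}


def random_walk(graph, start="start", end="end", max_steps=20, rng=None):
    """Sample a path of states by following weighted edges from `start`."""
    rng = rng or random.Random()
    path = [start]
    state = start
    while state != end and len(path) < max_steps:
        edges = graph[state]
        if not edges:  # dead end: no outgoing transitions
            break
        next_states, weights = zip(*edges)
        state = rng.choices(next_states, weights=weights, k=1)[0]
        path.append(state)
    return path


def simulate_turn(intent, history):
    """Placeholder for an LLM call (assumption): a real system would prompt
    a model with the intent and the dialogue history to produce both sides."""
    return {
        "intent": intent,
        "user": f"<user utterance for intent '{intent}'>",
        "system": f"<system response for intent '{intent}'>",
    }


def simulate_dialogue(graph, rng=None):
    """Turn one sampled path into a list of intent-annotated turns."""
    history = []
    for intent in random_walk(graph, rng=rng):
        if intent in ("start", "end"):  # boundary states carry no utterance
            continue
        history.append(simulate_turn(intent, history))
    return history


if __name__ == "__main__":
    for turn in simulate_dialogue(GRAPH, rng=random.Random(0)):
        print(turn["intent"], "|", turn["user"], "|", turn["system"])
```

Because each turn is generated from a known graph state, the intent label comes for free with every simulated utterance, which is one plausible reason a graph-guided simulation gives more structured training data than a single free-form prompt.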
