CL, AI | Jan 21, 2025

Leveraging Graph Structures and Large Language Models for End-to-End Synthetic Task-Oriented Dialogues

arXiv:2501.11977v1 | h-index: 1 | AAMAS
Originality: Incremental advance
AI Analysis

This work addresses dataset creation for task-oriented dialogues and makes it more accessible to non-technical users. The advance is incremental, since it builds on existing LLM-based synthetic data generation methods.

The paper tackles the high cost and complexity of training task-oriented dialogue systems by introducing GraphTOD, an end-to-end framework that simplifies synthetic data generation using transition graphs in JSON format, resulting in high-quality dialogues across domains and significantly lowering dataset creation costs.

Training task-oriented dialogue systems is both costly and time-consuming, due to the need for high-quality datasets encompassing diverse intents. Traditional methods depend on extensive human annotation, while recent advancements leverage large language models (LLMs) to generate synthetic data. However, these approaches often require custom prompts or code, limiting accessibility for non-technical users. We introduce GraphTOD, an end-to-end framework that simplifies the generation of task-oriented dialogues. Users can create dialogues by specifying transition graphs in JSON format. Our evaluation demonstrates that GraphTOD generates high-quality dialogues across various domains, significantly lowering the cost and complexity of dataset creation.
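To make the idea of a JSON transition graph concrete, here is a minimal sketch in Python. The schema (`initial_state`, `final_states`, `transitions`), the state and action names, and the `sample_path` helper are all hypothetical illustrations, not GraphTOD's actual input format or API. The sketch only shows how a graph of this kind can be walked to obtain a sequence of states and user actions, which a generator could then turn into dialogue turns.

```python
import json
import random

# Hypothetical schema (an assumption, not GraphTOD's real format):
# each state maps a user action to the state it leads to.
GRAPH_JSON = """
{
  "initial_state": "Start",
  "final_states": ["End"],
  "transitions": {
    "Start": {"ask_for_recipe": "ShowRecipes"},
    "ShowRecipes": {"select_recipe": "ShowDetails", "refine_search": "ShowRecipes"},
    "ShowDetails": {"say_goodbye": "End", "search_again": "ShowRecipes"}
  }
}
"""


def sample_path(graph, rng, max_turns=20):
    """Random walk from the initial state until a final state or max_turns."""
    state = graph["initial_state"]
    path = []
    for _ in range(max_turns):
        if state in graph["final_states"]:
            break
        # Sort the outgoing edges so the walk is reproducible for a given seed.
        action, next_state = rng.choice(sorted(graph["transitions"][state].items()))
        path.append((state, action, next_state))
        state = next_state
    return path


graph = json.loads(GRAPH_JSON)
path = sample_path(graph, random.Random(0))
for state, action, next_state in path:
    print(f"{state} --{action}--> {next_state}")
```

Each printed step is one edge of the graph. Under this assumed setup, a domain is described entirely by the JSON document, so a new domain needs a new graph rather than new prompts or code.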

Code Implementations: 1 repo
