Turning Conversations into Workflows: A Framework to Extract and Evaluate Dialog Workflows for Service AI Agents
The work addresses the need for consistent and accurate automated service agents by providing a scalable method for extracting undocumented workflows, though the contribution is incremental because it builds on existing techniques such as chain-of-thought prompting.
The paper tackles the problem of automatically extracting dialog workflows from historical conversations for service AI agents, presenting a framework that improves workflow extraction by 12.16% in average macro accuracy over the baseline.
Automated service agents require well-structured workflows to provide consistent and accurate responses to customer queries. However, these workflows are often undocumented, and their automatic extraction from conversations remains unexplored. In this work, we present a novel framework for extracting and evaluating dialog workflows from historical interactions. Our extraction process consists of two key stages: (1) a retrieval step that selects relevant conversations based on key procedural elements, and (2) a structured workflow generation step that uses question-answer-based chain-of-thought (QA-CoT) prompting. To comprehensively assess the quality of the extracted workflows, we introduce an automated simulation framework in which agent and customer bots interact, measuring how effectively the workflows resolve customer issues. Extensive experiments on the ABCD and SynthABCD datasets demonstrate that our QA-CoT technique improves workflow extraction by 12.16% in average macro accuracy over the baseline. Moreover, our evaluation method closely aligns with human assessments, providing a reliable and scalable framework for future research.
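The two extraction stages described above can be pictured with a minimal sketch. It is only an illustration under stated assumptions, not the paper's implementation: the `Conversation` class, the overlap-based ranking in `select_conversations`, the prompt layout in `build_qa_cot_prompt`, and every parameter name are hypothetical. The abstract does not say how procedural elements are identified or how the QA-CoT prompt is worded, and the call to a language model is omitted entirely.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class Conversation:
    conv_id: str
    turns: List[str]
    # Hypothetical: procedural elements assumed to be tagged in advance.
    elements: Set[str]


def select_conversations(conversations, target_elements, top_k=2):
    """Stage 1 (assumed form): keep the conversations that share the most
    procedural elements with the target set, dropping those with no overlap."""
    scored = []
    for conv in conversations:
        overlap = len(conv.elements & target_elements)
        if overlap > 0:
            scored.append((overlap, conv))
    # Highest overlap first; ties broken by id so the order is deterministic.
    scored.sort(key=lambda pair: (-pair[0], pair[1].conv_id))
    return [conv for _, conv in scored[:top_k]]


def build_qa_cot_prompt(intent, selected, guiding_questions):
    """Stage 2 (assumed form): assemble a prompt that asks the model to answer
    guiding questions about the conversations before writing the workflow."""
    lines = [f"Intent: {intent}", "", "Conversations:"]
    for conv in selected:
        lines.append(f"[{conv.conv_id}]")
        lines.extend(f"  {turn}" for turn in conv.turns)
    lines.append("")
    lines.append("Answer each question step by step before writing the workflow:")
    for i, question in enumerate(guiding_questions, start=1):
        lines.append(f"Q{i}: {question}")
    lines.append("")
    lines.append("Workflow:")
    return "\n".join(lines)
```

In use, the output of `select_conversations` would be passed to `build_qa_cot_prompt`, and the resulting string sent to whatever language model generates the workflow. Ranking by simple set overlap is chosen here only because it is easy to follow; any retrieval method that scores conversations against procedural elements could take its place.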