Goal Alignment in LLM-Based User Simulators for Conversational AI
This work addresses a critical gap in conversational AI by improving the reliability of user simulators for scalable agent development and evaluation, though the contribution is incremental in that it builds on existing LLM capabilities.
The paper tackles the failure of LLM-based user simulators to maintain goal-oriented behavior in multi-turn conversations. It introduces the User Goal State Tracking (UGST) framework to improve goal alignment and reports substantial improvements on two benchmarks, MultiWOZ 2.4 and τ-Bench.
User simulators are essential to conversational AI, enabling scalable agent development and evaluation through simulated interactions. While current Large Language Models (LLMs) have advanced user simulation capabilities, we show that they struggle to consistently demonstrate goal-oriented behavior across multi-turn conversations, a critical limitation that compromises their reliability in downstream applications. We introduce User Goal State Tracking (UGST), a novel framework that tracks user goal progression throughout conversations. Leveraging UGST, we present a three-stage methodology for developing user simulators that can autonomously track goal progression and reason to generate goal-aligned responses. Moreover, we establish comprehensive evaluation metrics for measuring goal alignment in user simulators, and demonstrate that our approach yields substantial improvements across two benchmarks (MultiWOZ 2.4 and τ-Bench). Our contributions address a critical gap in conversational AI and establish UGST as an essential framework for developing goal-aligned user simulators.
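To make the idea of tracking goal progression concrete, the following is a minimal sketch of what a goal state might look like as a data structure. It is an illustration under stated assumptions and not the UGST implementation: the abstract does not specify how goals are decomposed or which status labels are used, so the `Status` labels, the `GoalState` class, its methods, and the example sub-goals are all hypothetical. The `completion_rate` helper is likewise only a stand-in and should not be read as one of the paper's evaluation metrics.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    # Hypothetical status labels; the source does not specify the label set.
    PENDING = "pending"
    COMPLETE = "complete"


@dataclass
class GoalState:
    """Hypothetical tracker: maps each sub-goal of a user goal to a status."""

    statuses: dict[str, Status] = field(default_factory=dict)

    @classmethod
    def from_subgoals(cls, subgoals: list[str]) -> "GoalState":
        # Every sub-goal starts as pending at the beginning of a conversation.
        return cls({goal: Status.PENDING for goal in subgoals})

    def mark_complete(self, subgoal: str) -> None:
        # Reject sub-goals that were never part of the user's goal.
        if subgoal not in self.statuses:
            raise KeyError(f"unknown sub-goal: {subgoal}")
        self.statuses[subgoal] = Status.COMPLETE

    def pending(self) -> list[str]:
        # Sub-goals the simulated user still needs to pursue, in original order.
        return [g for g, s in self.statuses.items() if s is Status.PENDING]

    def completion_rate(self) -> float:
        # Fraction of sub-goals completed; an illustrative stand-in metric only.
        if not self.statuses:
            return 1.0
        done = sum(s is Status.COMPLETE for s in self.statuses.values())
        return done / len(self.statuses)


# Example sub-goals are invented for illustration.
state = GoalState.from_subgoals(
    ["book a hotel", "request free parking", "get the address"]
)
state.mark_complete("book a hotel")
remaining = state.pending()
```

In a simulator loop, a structure like this could be updated after each turn and the remaining sub-goals passed back into the prompt, so that the next simulated user utterance is conditioned on what is still unresolved.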