PyTOD: Programmable Task-Oriented Dialogue with Execution Feedback
This work addresses a key bottleneck in task-oriented dialogue systems, namely accurate dialogue state tracking, and proposes a new method rather than an incremental refinement of existing ones.
The researchers tackle the problem of accurate state tracking in programmable task-oriented dialogue agents by developing PyTOD, which generates executable code to track dialogue state and uses policy and execution feedback for error correction. The approach achieves state-of-the-art state tracking performance on the Schema-Guided Dialogue (SGD) benchmark and surpasses strong baselines in both accuracy and the robustness of user goal estimation as the dialogue progresses.
Programmable task-oriented dialogue (TOD) agents enable language models to follow structured dialogue policies, but their effectiveness hinges on accurate state tracking. We present PyTOD, an agent that generates executable code to track dialogue state and uses policy and execution feedback for efficient error correction. To this end, PyTOD employs a simple constrained decoding approach, using a language model instead of grammar rules to follow API schemata. This leads to state-of-the-art state tracking performance on the challenging SGD benchmark. Our experiments show that PyTOD surpasses strong baselines in both accuracy and robust user goal estimation as the dialogue progresses, demonstrating the effectiveness of execution-aware state tracking.
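To make the idea of execution-aware state tracking concrete, the following is a minimal sketch and not PyTOD's actual implementation. It assumes a toy schema with a single hypothetical API (`FindRestaurants`), represents the dialogue state as the slots filled by executing one generated line of Python per turn, and turns any execution error into a feedback string. The names `Api`, `SlotError`, `run_turn`, and `track` are illustrative assumptions, and the hard-coded candidate lists stand in for a language model that would regenerate code after reading the feedback.

```python
class SlotError(Exception):
    """Raised when generated code violates the toy API schema."""


class Api:
    """Hypothetical stand-in for one schema-defined API."""

    def __init__(self, name, allowed_slots):
        self.name = name
        self.allowed_slots = set(allowed_slots)
        self.slots = {}  # the tracked dialogue state for this API

    def update(self, **kwargs):
        # Validate every slot before mutating, so a bad call leaves state intact.
        unknown = sorted(s for s in kwargs if s not in self.allowed_slots)
        if unknown:
            raise SlotError(f"{self.name} has no slot(s): {', '.join(unknown)}")
        self.slots.update(kwargs)


def run_turn(code, namespace):
    """Execute one generated line; return (ok, feedback)."""
    try:
        # Empty builtins keep the toy example from calling arbitrary functions.
        # This is not a security sandbox.
        exec(code, {"__builtins__": {}}, namespace)
        return True, None
    except Exception as exc:
        return False, f"{type(exc).__name__}: {exc}"


def track(candidates_per_turn, namespace):
    """For each turn, try candidate programs until one executes cleanly."""
    log = []
    for candidates in candidates_per_turn:
        for code in candidates:
            ok, feedback = run_turn(code, namespace)
            log.append((code, feedback))
            if ok:
                break  # a real agent would stop regenerating here
    return log


ns = {"restaurant": Api("FindRestaurants", ["city", "cuisine"])}
log = track(
    [
        # Turn 1: the first candidate already matches the schema.
        ['restaurant.update(city="Paris")'],
        # Turn 2: the first candidate uses a wrong slot name; the second
        # stands in for a correction produced after seeing the feedback.
        ['restaurant.update(food="thai")', 'restaurant.update(cuisine="thai")'],
    ],
    ns,
)

print(ns["restaurant"].slots)  # {'city': 'Paris', 'cuisine': 'thai'}
print(log[1][1])               # SlotError: FindRestaurants has no slot(s): food
```

In this sketch the failed call leaves the state unchanged, so the error message is the only thing carried forward to the next attempt. That is one simple way to realize feedback-driven correction under the stated assumptions.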