CL Dec 20, 2022

Dialog2API: Task-Oriented Dialogue with API Description and Example Programs

Raphael Shu, Elman Mansimov, Tamer Alkhouli, Nikolaos Pappas, Salvatore Romeo, Arshit Gupta, Saab Mansour, Yi Zhang, Dan Roth

arXiv:2212.09946v12.610 citations, h-index: 98

Originality: Highly original

AI Analysis

The paper addresses the constrained functionality and dialogue experience of task-oriented dialogue systems in applications such as software automation and customer service, and it proposes a new paradigm rather than an incremental improvement.

The authors address the limitations of conventional task-oriented dialogue systems by introducing Dialog2API, a paradigm in which the model generates and executes programs that call APIs, which supports composite goals and a more robust dialogue experience. They construct a dataset for AWS S3 APIs and report evaluation results for in-context learning baselines.

Functionality and dialogue experience are two important factors of task-oriented dialogue systems. Conventional approaches with a closed schema (e.g., conversational semantic parsing) often fail because both the functionality and the dialogue experience are strongly constrained by the underlying schema. We introduce a new paradigm for task-oriented dialogue, Dialog2API, to greatly expand the functionality and provide a seamless dialogue experience. The conversational model interacts with the environment by generating and executing programs that trigger a set of pre-defined APIs. The model also manages the dialogue policy and interacts with the user by generating appropriate natural language responses. By allowing the generation of free-form programs, Dialog2API supports composite goals that combine different APIs, whereas unrestricted program revision provides a natural and robust dialogue experience. To facilitate Dialog2API, the core model is provided with API documents, an execution environment and, optionally, some example dialogues annotated with programs. We propose an approach tailored for Dialog2API, where the dialogue states are represented by a stack of programs, with the most recently mentioned program on the top of the stack. Dialog2API can work with many application scenarios such as software automation and customer service. In this paper, we construct a dataset for AWS S3 APIs and present evaluation results of in-context learning baselines.
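The stack-of-programs dialogue state can be illustrated with a minimal sketch. This is one possible reading of the abstract, not the authors' implementation: the class names `Program` and `DialogueState`, the methods `mention`, `top` and `complete`, and the `s3.*` program strings are all hypothetical and do not correspond to any real AWS SDK call.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Program:
    """A generated program and the user goal it serves (illustrative only)."""
    goal: str
    source: str  # program text; the s3.* calls used below are made up


@dataclass
class DialogueState:
    """Dialogue state as a stack of programs; the last list item is the top."""
    stack: List[Program] = field(default_factory=list)

    def mention(self, goal: str, source: str) -> Program:
        """Add or revise the program for `goal` and move it to the top."""
        # Drop any earlier version of this goal's program so that the
        # revised version replaces it instead of duplicating it.
        self.stack = [p for p in self.stack if p.goal != goal]
        program = Program(goal, source)
        self.stack.append(program)
        return program

    def top(self) -> Optional[Program]:
        """Return the most recently mentioned program, if any."""
        return self.stack[-1] if self.stack else None

    def complete(self) -> Optional[Program]:
        """Pop the top program, e.g. after it has been executed."""
        return self.stack.pop() if self.stack else None


# Hypothetical three-turn dialogue: the user starts one goal, switches to a
# second one, then goes back and revises the first.
state = DialogueState()
state.mention("create_bucket", "s3.create_bucket(name='reports')")
state.mention("upload_file", "s3.upload(bucket='reports', path='q1.csv')")
state.mention("create_bucket",
              "s3.create_bucket(name='reports', region='us-west-2')")

current = state.top()
print(current.goal, len(state.stack))
```

In this sketch, revising an earlier goal moves its program back to the top, so the model always works on the most recently mentioned program while the interrupted goal stays on the stack for later.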
