CLJun 30, 2016

A Sequence-to-Sequence Model for User Simulation in Spoken Dialogue Systems

arXiv:1607.00070v118.0128 citations

Originality Incremental advance

AI Analysis

This work addresses the need for more flexible and accurate user simulation in spoken dialogue systems, offering a data-driven approach that improves over previous methods, though it is incremental in nature.

The paper tackles the problem of generating realistic user behavior for training spoken dialogue systems by introducing a sequence-to-sequence recurrent neural network model that outputs user intentions as dialogue acts. It shows that this model outperforms existing agenda-based and n-gram simulators on the DSTC2 dataset, achieving higher F-scores, and can operate on the original action space for finer granularity.

User simulation is essential for generating enough data to train a statistical spoken dialogue system. Previous models for user simulation suffer from several drawbacks, such as the inability to take dialogue history into account, the need of rigid structure to ensure coherent user behaviour, heavy dependence on a specific domain, the inability to output several user intentions during one dialogue turn, or the requirement of a summarized action space for tractability. This paper introduces a data-driven user simulator based on an encoder-decoder recurrent neural network. The model takes as input a sequence of dialogue contexts and outputs a sequence of dialogue acts corresponding to user intentions. The dialogue contexts include information about the machine acts and the status of the user goal. We show on the Dialogue State Tracking Challenge 2 (DSTC2) dataset that the sequence-to-sequence model outperforms an agenda-based simulator and an n-gram simulator, according to F-score. Furthermore, we show how this model can be used on the original action space and thereby models user behaviour with finer granularity.

View on arXiv PDF

Similar