CL LG ML | Nov 9, 2019

Conditioned Query Generation for Task-Oriented Dialogue Systems

Stéphane d'Ascoli, Alice Coucke, Francesco Caltagirone, Alexandre Caulier, Marc Lelarge

arXiv:1911.03698v10.31 citations | Has Code

Originality: Incremental advance

AI Analysis

The work addresses data scarcity for developers of closed-domain dialogue systems by offering a cheap and fast alternative to manual annotation for data augmentation. The advance is incremental, since it builds on existing generation techniques.

The paper tackles the scarcity of training data for task-oriented dialogue systems by proposing a controlled data generation method that combines a conditional variational autoencoder with a query transfer protocol. It reports that, in the appropriate regime, the method consistently improves query diversity without compromising quality.

Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. In this paper, we propose a novel controlled data generation method that could be used as a training augmentation framework for closed-domain dialogue. Our contribution is twofold. First, we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then, we introduce a novel protocol called query transfer that makes it possible to leverage a broad, unlabelled dataset to extract relevant information. Comparison with two different baselines shows that our method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality.
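
For readers unfamiliar with conditional variational autoencoders, the following is a minimal sketch of the generic CVAE training objective, not the authors' implementation. The abstract does not specify the architecture, the conditioning mechanism, or the loss weighting, so everything here is an assumption for illustration: the function names are hypothetical, the intent is assumed to be injected by concatenating a one-hot label to the latent code, and `kl_weight` is an assumed knob for balancing the two loss terms.

```python
import math
import random


def one_hot(index, size):
    """Encode an intent label as a one-hot list (assumed conditioning format)."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def gaussian_kl(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from a standard normal."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


def decoder_input(z, intent, num_intents):
    """Condition the decoder by concatenating the latent code with the intent one-hot."""
    return list(z) + one_hot(intent, num_intents)


def negative_elbo(reconstruction_nll, mu, log_var, kl_weight=1.0):
    """Generic CVAE loss: reconstruction term plus a weighted KL regularizer."""
    return reconstruction_nll + kl_weight * gaussian_kl(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.2, -0.1], [0.0, -0.5]   # hypothetical encoder outputs
    z = reparameterize(mu, log_var, rng)
    print(decoder_input(z, intent=1, num_intents=3))
    print(negative_elbo(reconstruction_nll=4.2, mu=mu, log_var=log_var, kl_weight=0.5))
```

At generation time, a model of this kind would typically draw `z` from the prior and pair it with the desired intent label, which is what makes the output intent-specific.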
