CL · LG · ML · Nov 9, 2019

Conditioned Query Generation for Task-Oriented Dialogue Systems

arXiv:1911.03698v1 · 1 citation
Originality: Incremental advance
AI Analysis

This work addresses data scarcity for developers of closed-domain dialogue systems by offering a cheap and fast alternative to manual annotation for data augmentation. The advance is incremental, since it builds on existing generation techniques.

The paper tackles the scarcity of training data for task-oriented dialogue systems by proposing a controlled data generation method that combines a conditional variational autoencoder with a query transfer protocol. It reports that, in the appropriate regime, the method consistently improves query diversity without compromising quality.

Scarcity of training data for task-oriented dialogue systems is a well-known problem that is usually tackled with costly and time-consuming manual data annotation. An alternative solution is to rely on automatic text generation which, although less accurate than human supervision, has the advantage of being cheap and fast. In this paper, we propose a novel controlled data generation method that could be used as a training augmentation framework for closed-domain dialogue. Our contribution is twofold. First, we show how to optimally train and control the generation of intent-specific sentences using a conditional variational autoencoder. Then, we introduce a novel protocol called query transfer that makes it possible to leverage a broad, unlabelled dataset to extract relevant information. Comparison with two different baselines shows that our method, in the appropriate regime, consistently improves the diversity of the generated queries without compromising their quality.
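
The abstract names a conditional variational autoencoder as the generator but does not spell out its objective. As a rough illustration only, the sketch below shows the generic building blocks of a conditional VAE loss: a one-hot intent condition, the reparameterization trick, the closed-form KL term, and the negative ELBO. It assumes a diagonal Gaussian posterior and a standard normal prior that does not depend on the intent. The function names and the `kl_weight` parameter are hypothetical, and the paper's actual architecture and training procedure may differ.

```python
import math
import random


def one_hot(index, size):
    """Encode an intent label as a one-hot condition vector."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps, with eps drawn from N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu, log_var):
    """Closed-form KL(N(mu, diag(exp(log_var))) || N(0, I))."""
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv
                     for m, lv in zip(mu, log_var))


def decoder_input(z, condition):
    """Concatenate the latent code with the intent condition (assumed design)."""
    return list(z) + list(condition)


def cvae_loss(reconstruction_nll, mu, log_var, kl_weight=1.0):
    """Negative ELBO: reconstruction term plus a weighted KL term.

    kl_weight is a hypothetical knob (e.g. for KL annealing), not taken
    from the paper.
    """
    return reconstruction_nll + kl_weight * kl_to_standard_normal(mu, log_var)


if __name__ == "__main__":
    rng = random.Random(0)
    mu, log_var = [0.0, 0.0], [0.0, 0.0]   # stand-in encoder outputs
    z = reparameterize(mu, log_var, rng)
    features = decoder_input(z, one_hot(1, 3))
    print(len(features), cvae_loss(2.0, mu, log_var))
```

In a real model, an encoder network would produce `mu` and `log_var` from the query and its intent, and a decoder network would turn `decoder_input(...)` into token probabilities. At generation time, one would sample `z` from the prior and choose the intent condition freely, which is what makes the generation controllable.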

Code Implementations: 1 repo