Variational Cross-domain Natural Language Generation for Spoken Dialogue Systems
This work addresses the challenge of balancing diversity and information retention in natural language generation for spoken dialogue systems, representing an incremental improvement over existing methods.
The paper tackles the problem of generating diverse and accurate sentences in cross-domain spoken dialogue systems by introducing a conditional variational autoencoder that incorporates latent sentence-level information. This yields improved performance over RNN-based generators, especially with limited training data.
Cross-domain natural language generation (NLG) is still a difficult task within spoken dialogue modelling. Given a semantic representation provided by the dialogue manager, the language generator should generate sentences that convey the desired information. Traditional template-based generators can produce sentences with all necessary information, but these sentences are not sufficiently diverse. With RNN-based models, the diversity of the generated sentences can be high; however, some information is lost in the process. In this work, we improve an RNN-based generator by considering latent information at the sentence level during generation using the conditional variational autoencoder architecture. We demonstrate that our model outperforms the original RNN-based generator, while yielding highly diverse sentences. In addition, our model performs better when the training data is limited.
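As a rough illustration of what "considering latent information at the sentence level" can mean in a conditional variational autoencoder, the sketch below computes the generic CVAE training objective: a reconstruction term plus a KL term between a recognition distribution and a prior over the latent variable. It is not taken from the paper. It assumes diagonal Gaussian distributions, and the function names, the `kl_weight` parameter, and the scalar `reconstruction_nll` (standing in for the negative log-likelihood the RNN decoder would assign to the sentence) are all hypothetical.

```python
import math
import random


def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) (reparameterization trick)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def gaussian_kl(mu_q, log_var_q, mu_p, log_var_p):
    """KL(q || p) between two diagonal Gaussians, summed over latent dimensions."""
    total = 0.0
    for mq, lq, mp, lp in zip(mu_q, log_var_q, mu_p, log_var_p):
        total += 0.5 * (lp - lq + (math.exp(lq) + (mq - mp) ** 2) / math.exp(lp) - 1.0)
    return total


def negative_elbo(reconstruction_nll, mu_q, log_var_q, mu_p, log_var_p, kl_weight=1.0):
    """Generic CVAE loss: reconstruction NLL plus (optionally weighted) KL term.

    Assumption: reconstruction_nll is the decoder's negative log-likelihood of
    the target sentence given the sampled latent z and the conditioning input.
    """
    return reconstruction_nll + kl_weight * gaussian_kl(mu_q, log_var_q, mu_p, log_var_p)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical 2-dimensional latent: recognition (q) and prior (p) parameters.
    mu_q, log_var_q = [0.3, -0.1], [-0.5, 0.2]
    mu_p, log_var_p = [0.0, 0.0], [0.0, 0.0]
    z = reparameterize(mu_q, log_var_q, rng)
    print("sampled latent:", z)
    print("loss:", negative_elbo(12.0, mu_q, log_var_q, mu_p, log_var_p))
```

In such a setup, the sampled latent vector would be fed to the generator alongside the semantic representation from the dialogue manager; how the paper's model wires this in is not specified in the abstract.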