CL, Oct 23, 2016

Two are Better than One: An Ensemble of Retrieval- and Generation-Based Dialog Systems

arXiv:1610.07149v1, 114 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of producing more meaningful responses in open-domain human-computer conversation, though the contribution appears incremental because it combines existing approaches.

The paper tackles the tendency of generative open-domain dialog systems to produce short, meaningless utterances. It proposes an ensemble of retrieval- and generation-based methods and reports that the ensemble outperforms each individual component by a large margin.

Open-domain human-computer conversation has attracted much attention in the field of NLP. Contrary to rule- or template-based domain-specific dialog systems, open-domain conversation usually requires data-driven approaches, which can be roughly divided into two categories: retrieval-based and generation-based systems. Retrieval systems search for a user-issued utterance (called a query) in a large database, and return a reply that best matches the query. Generative approaches, typically based on recurrent neural networks (RNNs), can synthesize new replies, but they suffer from the problem of generating short, meaningless utterances. In this paper, we propose a novel ensemble of retrieval-based and generation-based dialog systems in the open domain. In our approach, the retrieved candidate, in addition to the original query, is fed to an RNN-based reply generator, so that the neural model is aware of more information. The generated reply is then fed back as a new candidate for post-reranking. Experimental results show that such an ensemble outperforms each single part of it by a large margin.
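The control flow described in the abstract (retrieve a candidate, condition the generator on both the query and that candidate, then rerank the retrieved and generated replies together) can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the word-overlap scorer, the `toy_generate` stub, and all function names are hypothetical stand-ins introduced here, not the paper's retrieval system, RNN generator, or reranker.

```python
from typing import Callable, List, Tuple


def tokens(text: str) -> set:
    """Lowercased whitespace tokens of a string, as a set."""
    return set(text.lower().split())


def jaccard(a: str, b: str) -> float:
    """Word-overlap similarity; a toy stand-in for a learned matching model."""
    ta, tb = tokens(a), tokens(b)
    if not ta or not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)


def retrieve(query: str, database: List[Tuple[str, str]]) -> str:
    """Return the reply whose stored query best matches the user query."""
    best_pair = max(database, key=lambda pair: jaccard(query, pair[0]))
    return best_pair[1]


def toy_generate(query: str, retrieved: str) -> str:
    """Hypothetical placeholder for the RNN generator.

    It only demonstrates that the generator sees BOTH the query and the
    retrieved candidate; it does not model language in any way.
    """
    topic = query.split()[-1]
    return f"{retrieved} but i also like {topic}"


def ensemble_reply(
    query: str,
    database: List[Tuple[str, str]],
    generate: Callable[[str, str], str] = toy_generate,
    score: Callable[[str, str], float] = jaccard,
) -> str:
    """Retrieve, generate conditioned on the retrieval, then rerank both."""
    retrieved = retrieve(query, database)
    generated = generate(query, retrieved)
    # Post-reranking: the generated reply competes with the retrieved one.
    candidates = [retrieved, generated]
    return max(candidates, key=lambda reply: score(query, reply))


DATABASE = [
    ("do you like coffee", "yes i drink coffee every morning"),
    ("what is the weather", "it is sunny today"),
]

print(ensemble_reply("do you like tea", DATABASE))
```

Passing `generate` and `score` as parameters keeps the three stages separable, so each toy component could be swapped for a real retrieval engine, neural generator, or trained reranker without changing the pipeline itself.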

Code Implementations: 2 repos