CL, AI. May 19, 2019

Learning to Memorize in Neural Task-Oriented Dialogue Systems

arXiv:1905.07687v1, 2 citations

Originality: Incremental advance

AI Analysis

This work aims to improve memory and adaptability in task-oriented dialogue systems for developers and users. It appears incremental, since it builds on existing mechanisms such as the copy mechanism and memory networks.

The thesis tackles the challenge of neural task-oriented dialogue learning by leveraging neural copy mechanisms and memory-augmented neural networks, achieving good performance in multi-domain dialogue state tracking, retrieval-based systems, and generation-based systems, with GLMP reaching state-of-the-art in human evaluation.

In this thesis, we leverage the neural copy mechanism and memory-augmented neural networks (MANNs) to address existing challenges in neural task-oriented dialogue learning. We show the effectiveness of our strategy by achieving good performance in multi-domain dialogue state tracking, retrieval-based dialogue systems, and generation-based dialogue systems. We first propose a transferable dialogue state generator (TRADE) that leverages its copy mechanism to remove the need for a dialogue ontology and to share knowledge between domains. We also evaluate dialogue state tracking on unseen domains and show that TRADE enables zero-shot dialogue state tracking and can adapt to new domains in a few-shot setting without forgetting the previous domains. Second, we utilize MANNs to improve retrieval-based dialogue learning. They are able to capture sequential dependencies in dialogue and memorize long-term information. We also propose a recorded delexicalization copy strategy that replaces real entity values with ordered entity types. Our models are shown to surpass other retrieval baselines, especially when the conversation has a large number of turns. Lastly, we tackle generation-based dialogue learning with two proposed models, the memory-to-sequence (Mem2Seq) model and the global-to-local memory pointer network (GLMP). Mem2Seq is the first model to combine multi-hop memory attention with the idea of the copy mechanism. GLMP further introduces the concepts of response sketching and double-pointer copying. We show that GLMP achieves state-of-the-art performance in human evaluation.
