CL · AI · Jun 19, 2017

Sub-domain Modelling for Dialogue Management with Hierarchical Reinforcement Learning

arXiv:1706.06210v21113 citations
AI Analysis

This paper addresses inefficient policy learning in dialogue systems that span multiple topics, offering an incremental improvement for developers of conversational AI.

The paper tackles multi-domain dialogue management by proposing a hierarchical reinforcement learning method based on the option framework. The authors report that it learns faster and reaches better policies than flat methods.

Human conversation is inherently complex, often spanning many different topics/domains. This makes policy learning for dialogue systems very challenging. Standard flat reinforcement learning methods do not provide an efficient framework for modelling such dialogues. In this paper, we focus on the under-explored problem of multi-domain dialogue management. First, we propose a new method for hierarchical reinforcement learning using the option framework. Next, we show that the proposed architecture learns faster and arrives at a better policy than the existing flat ones do. Moreover, we show how pretrained policies can be adapted to more complex systems with an additional set of new actions. In doing that, we show that our approach has the potential to facilitate policy optimisation for more sophisticated multi-domain dialogue systems.
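
To make the option framework mentioned above concrete, here is a minimal sketch rather than the authors' implementation. It assumes a toy two-domain task (hotel and restaurant, each needing two slots), hand-coded intra-option policies, and tabular SMDP Q-learning for the master policy. The abstract does not specify the learner, and every name here (`Option`, `run_option`, `train`, the reward values) is hypothetical and chosen only for illustration.

```python
import random
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

# Toy state (an assumption): (hotel slots filled, restaurant slots filled).
State = Tuple[int, int]
SLOTS_NEEDED = 2
GAMMA = 0.95


def step(state: State, domain: int) -> Tuple[State, float]:
    """Toy environment: one primitive turn fills one slot in `domain`."""
    filled = list(state)
    filled[domain] = min(SLOTS_NEEDED, filled[domain] + 1)
    nxt = (filled[0], filled[1])
    # -1 per turn, +20 bonus once both domains are complete (illustrative values).
    reward = -1.0 + (20.0 if nxt == (SLOTS_NEEDED, SLOTS_NEEDED) else 0.0)
    return nxt, reward


@dataclass(frozen=True)
class Option:
    name: str
    can_start: Callable[[State], bool]  # initiation set
    act: Callable[[State], int]         # intra-option policy -> primitive action
    is_done: Callable[[State], bool]    # termination condition


def make_domain_option(name: str, domain: int) -> Option:
    """Build a sub-domain option with a fixed (not learned) inner policy."""
    return Option(
        name=name,
        can_start=lambda s: s[domain] < SLOTS_NEEDED,
        act=lambda s: domain,
        is_done=lambda s: s[domain] == SLOTS_NEEDED,
    )


OPTIONS = [make_domain_option("hotel", 0), make_domain_option("restaurant", 1)]


def run_option(state: State, option: Option) -> Tuple[State, float, int]:
    """Run an option to termination; return (next state, discounted reward, steps)."""
    total, k = 0.0, 0
    while True:
        state, reward = step(state, option.act(state))
        total += (GAMMA ** k) * reward
        k += 1
        if option.is_done(state):
            return state, total, k


def train(episodes: int = 200, alpha: float = 0.5,
          epsilon: float = 0.2, seed: int = 0) -> Dict[Tuple[State, str], float]:
    """SMDP Q-learning over options: the master policy chooses a sub-domain."""
    rng = random.Random(seed)
    q: Dict[Tuple[State, str], float] = {}
    for _ in range(episodes):
        state: State = (0, 0)
        while True:
            available = [o for o in OPTIONS if o.can_start(state)]
            if not available:  # both domains complete: episode ends
                break
            if rng.random() < epsilon:
                option = rng.choice(available)
            else:
                option = max(available, key=lambda o: q.get((state, o.name), 0.0))
            nxt, reward, k = run_option(state, option)
            best_next = max(
                (q.get((nxt, o.name), 0.0) for o in OPTIONS if o.can_start(nxt)),
                default=0.0,
            )
            old = q.get((state, option.name), 0.0)
            # The option lasted k primitive steps, so bootstrap with GAMMA ** k.
            q[(state, option.name)] = old + alpha * (reward + GAMMA ** k * best_next - old)
            state = nxt
    return q
```

The point of the sketch is the temporal abstraction: the master policy makes one decision per sub-domain instead of one per turn, and the update discounts by the number of primitive steps the option took. A flat learner would instead have to assign credit across every individual turn.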

