CL AI NE
Mar 8, 2018

Feudal Reinforcement Learning for Dialogue Management in Large Domains

Iñigo Casanueva, Paweł Budzianowski, Pei-Hao Su, Stefan Ultes, Lina Rojas-Barahona, Bo-Hsiang Tseng, Milica Gašić

arXiv:1803.03232v132.21107 citations

Originality: Highly original

AI Analysis

This work addresses the scalability issue in dialogue management for large domains, which is a key problem for developers of conversational AI systems, and it represents a novel method rather than an incremental improvement.

The paper tackles the scalability problem of reinforcement learning for dialogue management in large domains by proposing a feudal RL architecture that decomposes each decision into two steps and uses the domain ontology for state abstraction and information sharing. The results show that an implementation based on Deep-Q Networks significantly outperforms previous state-of-the-art methods in several dialogue domains and environments, without needing additional reward signals.

Reinforcement learning (RL) is a promising approach to dialogue policy optimisation. Traditional RL algorithms, however, fail to scale to large domains due to the curse of dimensionality. We propose a novel Dialogue Management architecture, based on Feudal RL, which decomposes the decision into two steps: a first step where a master policy selects a subset of primitive actions, and a second step where a primitive action is chosen from the selected subset. The structural information included in the domain ontology is used to abstract the dialogue state space, and the decision at each step is taken using a different part of the abstracted state. This, combined with an information sharing mechanism between slots, increases the scalability to large domains. We show that an implementation of this approach, based on Deep-Q Networks, significantly outperforms the previous state of the art in several dialogue domains and environments, without the need for any additional reward signal.
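The two-step decision described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation. The action names, the slot names, the split of actions into a slot-independent and a slot-dependent subset, the linear Q-functions standing in for Deep-Q Networks, and the single weight matrix shared across slots are all assumptions made for illustration.

```python
# Hypothetical action and slot names, chosen only for illustration.
SLOT_INDEPENDENT = ["hello", "inform", "bye"]
SLOT_DEPENDENT = ["request", "confirm"]
SLOTS = ["area", "food", "price"]


def q_values(weights, features):
    """Linear stand-in for a Deep-Q Network: one Q-value per weight row."""
    return [sum(w * f for w, f in zip(row, features)) for row in weights]


def argmax(values):
    """Index of the largest value (first one on ties)."""
    return max(range(len(values)), key=values.__getitem__)


def feudal_select(master_w, indep_w, dep_w, general_feats, slot_feats):
    """Greedy two-step action selection.

    Step 1: the master policy scores the two subsets from general features.
    Step 2: a sub-policy picks one primitive action inside the chosen subset.
    Returns (action, slot); slot is None for slot-independent actions.
    """
    master_q = q_values(master_w, general_feats)
    if argmax(master_q) == 0:
        q = q_values(indep_w, general_feats)
        return SLOT_INDEPENDENT[argmax(q)], None

    # Assumption: the same weights score every slot, each from its own
    # slot-specific features, as a simple form of sharing between slots.
    best = None
    for slot in SLOTS:
        q = q_values(dep_w, slot_feats[slot])
        i = argmax(q)
        if best is None or q[i] > best[0]:
            best = (q[i], SLOT_DEPENDENT[i], slot)
    return best[1], best[2]


general = [1.0, 0.0]
slot_feats = {"area": [1.0, 0.0], "food": [0.0, 1.0], "price": [0.0, 0.0]}
dep_w = [[0.1, 0.2], [0.3, 0.9]]
indep_w = [[0.0, 0.0], [2.0, 0.0], [1.0, 0.0]]
master_w = [[0.0, 0.0], [1.0, 0.0]]  # favours the slot-dependent subset
print(feudal_select(master_w, indep_w, dep_w, general, slot_feats))
# prints ('confirm', 'food')
```

Because the slot-dependent sub-policy sees only one slot's features at a time, adding a slot in this sketch adds no new parameters, which gives a rough sense of why such a decomposition might scale better than a single flat policy over all actions.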
