CL, LG | Jun 3, 2020

Meta Dialogue Policy Learning

arXiv:2006.02588v1 | 8 citations
Originality: Incremental advance
AI Analysis

The paper addresses data scarcity in dialogue systems for AI assistants. The advance is incremental because it builds on existing transfer-learning and meta-learning methods.

The paper tackles the problem of dialogue policy adaptation to novel domains with limited data by proposing a meta-learning framework with a dual-replay mechanism, achieving higher success rates and dialogue efficiency on the MultiWOZ 2.0 dataset compared to baselines.

Dialog policy determines the next-step actions for agents and hence is central to a dialogue system. However, when migrated to novel domains with little data, a policy model can fail to adapt due to insufficient interactions with the new environment. We propose Deep Transferable Q-Network (DTQN) to utilize shareable low-level signals between domains, such as dialogue acts and slots. We decompose the state and action representation space into feature subspaces corresponding to these low-level components to facilitate cross-domain knowledge transfer. Furthermore, we embed DTQN in a meta-learning framework and introduce Meta-DTQN with a dual-replay mechanism to enable effective off-policy training and adaptation. In experiments, our model outperforms baseline models in terms of both success rate and dialogue efficiency on the multi-domain dialogue dataset MultiWOZ 2.0.
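The decomposition idea in the abstract can be illustrated with a small sketch. This is not the paper's DTQN architecture: the embedding tables `act_emb` and `slot_emb`, the dimension `DIM`, the random initial values, and the dot-product scoring in `q_value` are all assumptions made for illustration. The point is only that an action built from a shared dialogue act and a domain-specific slot reuses the act's parameters in every domain.

```python
import random

# Hypothetical setup: every name and number here is an illustrative assumption.
rng = random.Random(0)
DIM = 4  # size of each feature subspace


def new_vec():
    """Return a random vector standing in for a learned embedding."""
    return [rng.uniform(-1.0, 1.0) for _ in range(DIM)]


# Dialogue acts are shared across domains; slots may be domain-specific.
act_emb = {"inform": new_vec(), "request": new_vec()}
slot_emb = {"price": new_vec(), "area": new_vec(), "departure": new_vec()}


def action_features(act, slot):
    """Concatenate the act subspace and the slot subspace into one vector."""
    return act_emb[act] + slot_emb[slot]


def q_value(state_features, act, slot):
    """Score a (state, action) pair with a dot product (an assumed scorer)."""
    feats = action_features(act, slot)
    return sum(s * a for s, a in zip(state_features, feats))


# A state vector with the same decomposed layout: [act part | slot part].
state = new_vec() + new_vec()
hotel_q = q_value(state, "request", "price")      # e.g. a hotel-domain action
train_q = q_value(state, "request", "departure")  # e.g. a train-domain action
```

Both actions share the `request` half of their feature vector, so whatever is learned about `request` in one domain carries over to the other; only the slot half differs.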
