CL, LG · Jun 3, 2020

Meta Dialogue Policy Learning

Yumo Xu, Chenguang Zhu, Baolin Peng, Michael Zeng

arXiv:2006.02588v11.18 citations

Originality: Incremental advance

AI Analysis

This work addresses data scarcity in dialogue systems for AI assistants. It is an incremental advance because it builds on existing transfer-learning and meta-learning methods.

The paper tackles the problem of dialogue policy adaptation to novel domains with limited data by proposing a meta-learning framework with a dual-replay mechanism, achieving higher success rates and dialogue efficiency on the MultiWOZ 2.0 dataset compared to baselines.

Dialog policy determines the next-step actions for agents and hence is central to a dialogue system. However, when migrated to novel domains with little data, a policy model can fail to adapt due to insufficient interactions with the new environment. We propose Deep Transferable Q-Network (DTQN) to utilize shareable low-level signals between domains, such as dialogue acts and slots. We decompose the state and action representation space into feature subspaces corresponding to these low-level components to facilitate cross-domain knowledge transfer. Furthermore, we embed DTQN in a meta-learning framework and introduce Meta-DTQN with a dual-replay mechanism to enable effective off-policy training and adaptation. In experiments, our model outperforms baseline models in terms of both success rate and dialogue efficiency on the multi-domain dialogue dataset MultiWOZ 2.0.
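The decomposition described in the abstract can be illustrated with a small sketch. This is not the authors' DTQN implementation: the embedding tables, the dot-product scorer, and all names (`ACT_EMB`, `SLOT_EMB`, `action_features`, `q_value`, `greedy_action`) are hypothetical, chosen only to show how an action built from a shared dialogue act and a shared slot can reuse the same feature subspaces across domains.

```python
import random

DIM = 4  # size of each feature subspace (assumed for illustration)


def make_embedding_table(names, dim, seed):
    """Build a fixed random embedding per name (stand-in for learned weights)."""
    rng = random.Random(seed)
    return {n: [rng.uniform(-1, 1) for _ in range(dim)] for n in names}


# Hypothetical shared vocabularies: dialogue acts and slots that recur across domains.
ACT_EMB = make_embedding_table(["inform", "request", "book"], DIM, seed=0)
SLOT_EMB = make_embedding_table(["area", "price", "name", "time"], DIM, seed=1)


def action_features(act, slot):
    """Represent an action as the concatenation of an act subspace and a slot subspace."""
    return ACT_EMB[act] + SLOT_EMB[slot]


def q_value(state_vec, act, slot):
    """Score a (state, action) pair with a dot product.

    The dot product is an assumption standing in for a Q-network.
    """
    feats = action_features(act, slot)
    return sum(s * f for s, f in zip(state_vec, feats))


def greedy_action(state_vec, candidates):
    """Pick the candidate (act, slot) pair with the highest score."""
    return max(candidates, key=lambda a: q_value(state_vec, *a))


if __name__ == "__main__":
    state = [1.0] * (2 * DIM)  # toy state vector matching the action feature size
    # Two domains with different action sets that still share acts and slots.
    hotel_actions = [("inform", "area"), ("request", "price"), ("book", "time")]
    restaurant_actions = [("inform", "area"), ("inform", "name"), ("book", "time")]
    print(greedy_action(state, hotel_actions))
    print(greedy_action(state, restaurant_actions))
```

Because both toy domains index into the same act and slot tables, whatever is learned for `("inform", "area")` in one domain is available in the other, which is the intuition behind transferring low-level signals. The meta-learning and dual-replay components are not sketched here, since the abstract gives no details about them.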
