Deep Reinforcement Learning for On-line Dialogue State Tracking
This work addresses on-line optimization of dialogue state tracking. The contribution is incremental: it applies an existing deep reinforcement learning (DRL) framework to a new application area.
The paper tackles the problem of optimizing dialogue state tracking (DST) for on-line task-oriented spoken dialogue systems. It proposes a novel deep reinforcement learning framework with companion teaching. On-line DST optimization improves dialogue manager performance while retaining the flexibility of a predefined policy, and joint training of DST and policy improves performance further.
Dialogue state tracking (DST) is a crucial module in dialogue management. It is usually cast as a supervised training problem, which is inconvenient for on-line optimization. In this paper, a novel companion-teaching-based deep reinforcement learning (DRL) framework for on-line DST optimization is proposed. To the best of our knowledge, this is the first effort to optimize the DST module within a DRL framework for on-line task-oriented spoken dialogue systems. In addition, the dialogue policy can be updated jointly with the tracker. Experiments show that on-line DST optimization can effectively improve dialogue manager performance while keeping the flexibility of using a predefined policy. Joint training of both DST and policy can further improve performance.
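To make the idea of optimizing a tracker from dialogue reward while the policy stays fixed more concrete, the following is a minimal toy sketch and not the framework proposed in the paper. It assumes a single slot with a few values, a noisy observation of the user goal, a hypothetical predefined policy `fixed_policy`, and a plain REINFORCE update. Companion teaching and joint policy training are not modeled. All names and parameter values are illustrative assumptions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fixed_policy(tracked_value):
    """Hypothetical predefined policy: offer whatever value the tracker reports."""
    return tracked_value


def train_tracker(num_values=3, episodes=3000, lr=0.1, seed=0):
    """Train a tabular stochastic tracker with REINFORCE; the policy is never updated."""
    rng = random.Random(seed)
    # theta[obs][value]: tracker logits for each possible noisy observation.
    theta = [[0.0] * num_values for _ in range(num_values)]
    for _ in range(episodes):
        goal = rng.randrange(num_values)
        # Assumed noise model: with probability 0.8 the observation equals the goal,
        # otherwise it is drawn uniformly from all values.
        obs = goal if rng.random() < 0.8 else rng.randrange(num_values)
        probs = softmax(theta[obs])
        # The tracker samples its state estimate; this is the "action" being learned.
        tracked = rng.choices(range(num_values), weights=probs)[0]
        action = fixed_policy(tracked)
        # Reward comes only from dialogue success, not from state labels.
        reward = 1.0 if action == goal else 0.0
        # REINFORCE: d/d theta log pi(tracked | obs) = onehot(tracked) - probs.
        for v in range(num_values):
            grad = (1.0 if v == tracked else 0.0) - probs[v]
            theta[obs][v] += lr * reward * grad
    return theta


if __name__ == "__main__":
    for obs, logits in enumerate(train_tracker()):
        print(obs, [round(p, 3) for p in softmax(logits)])
```

In this sketch the only learning signal is the success reward, which is what distinguishes it from the usual supervised formulation of DST. A joint-training variant would additionally update the policy from the same reward.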