CL, AI, HC
May 4, 2023

An Asynchronous Updating Reinforcement Learning Framework for Task-oriented Dialog System

arXiv:2305.02718v1
Has Code
Originality: Incremental advance
AI Analysis

This work addresses training instability in modular dialog systems, an incremental improvement for developers of conversational AI.

The authors tackled the problem of mutual interference between dialog state tracking (DST) and dialog policy (DP) modules during reinforcement learning training in task-oriented dialog systems, proposing an asynchronous updating framework (AURL) that achieved a 31.37% improvement in dialog success rate on the SSD-PHONE dataset.

Reinforcement learning has been applied to train dialog systems in many works. Previous approaches divide the dialog system into multiple modules, including DST (dialog state tracking) and DP (dialog policy), and train these modules simultaneously. However, the modules influence each other during training: errors from DST might misguide the dialog policy, and the system action brings extra difficulties for the DST module. To alleviate this problem, we propose the Asynchronous Updating Reinforcement Learning framework (AURL), which updates the DST module and the DP module asynchronously under a cooperative setting. Furthermore, curriculum learning is implemented to address the problem of unbalanced data distribution during reinforcement learning sampling, and multiple user models are introduced to increase dialog diversity. Results on the public SSD-PHONE dataset show that our method achieves a 31.37% improvement in dialog success rate. The code is publicly available via https://github.com/shunjiu/AURL.
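The asynchronous updating idea in the abstract can be illustrated with a toy schedule. The sketch below is not the authors' implementation: it assumes that "asynchronous" means alternating phases in which only one of the two modules is trainable while the other is frozen, and every name in it (`alternating_schedule`, `ToyModule`, `train`, the phase lengths, the constant reward) is hypothetical. It only shows the bookkeeping of such a schedule, with no actual DST or DP model.

```python
from itertools import cycle


def alternating_schedule(num_phases, steps_per_phase, modules=("DST", "DP")):
    """Yield (step, active_module) pairs; one module is active per phase."""
    step = 0
    order = cycle(modules)
    for _ in range(num_phases):
        active = next(order)
        for _ in range(steps_per_phase):
            yield step, active
            step += 1


class ToyModule:
    """Stand-in for a trainable module; it only counts applied updates."""

    def __init__(self, name):
        self.name = name
        self.updates = 0
        self.frozen = False

    def update(self, reward):
        # A frozen module ignores the training signal for this step.
        if not self.frozen:
            self.updates += 1


def train(num_phases=4, steps_per_phase=3):
    mods = {"DST": ToyModule("DST"), "DP": ToyModule("DP")}
    log = []
    for _step, active in alternating_schedule(num_phases, steps_per_phase):
        # Freeze every module except the one scheduled for this phase.
        for name, module in mods.items():
            module.frozen = name != active
        # Assumption: both modules see the same dialog-level reward
        # (a stand-in for the cooperative setting); here it is a constant.
        reward = 1.0
        for module in mods.values():
            module.update(reward)
        log.append(active)
    return mods, log


if __name__ == "__main__":
    mods, log = train()
    print(log)
    print({name: m.updates for name, m in mods.items()})
```

With the default arguments, the schedule runs four phases of three steps each, alternating DST and DP, so each toy module receives updates only during its own phases. In a real system the phase length and switching criterion would be design choices; the abstract does not specify them.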

Code Implementations: 1 repo