Joint On-line Learning of a Zero-shot Spoken Semantic Parser and a Reinforcement Learning Dialogue Manager
This work addresses the challenge of data collection for dialogue systems, enabling more efficient training with minimal user interaction. Its contribution is incremental, as it builds on prior work on on-line learning.
The paper tackles the data acquisition bottleneck in dialogue systems by proposing an on-line learning approach that jointly trains a zero-shot spoken semantic parser and a reinforcement learning dialogue manager, achieving performance surpassing an expert-based system after only a few hundred training dialogues.
Despite many recent advances in the design of dialogue systems, a true bottleneck remains the acquisition of the data required to train their components. Unlike many other language processing applications, dialogue systems require interactions with users, so they are difficult to develop with pre-recorded data. Building on previous work, on-line learning is pursued here as a most convenient way to address the issue: data collection, annotation and use in learning algorithms are performed in a single process. The main difficulties are then to bootstrap an initial basic system and to control the level of additional cost on the user side. Since well-performing solutions for speech recognition and synthesis can be used directly off the shelf, the study focuses on learning the spoken language understanding and dialogue management modules only. Several variants of joint learning are investigated and tested in user trials to confirm that overall on-line learning can be achieved after only a few hundred training dialogues and that the resulting system can outperform an expert-based system.