Encoding Word Confusion Networks with Recurrent Neural Networks for Dialog State Tracking
This work addresses the challenge of handling speech recognition errors to make dialog state tracking in spoken dialog systems more accurate, although the improvement it offers is incremental.
The paper tackles the problem of dialog state tracking in spoken dialog systems by encoding word confusion networks from automatic speech recognition output with recurrent neural networks, resulting in improved performance over using only the best hypothesis on the second Dialog State Tracking Challenge dataset.
This paper presents our novel method to encode word confusion networks, which can represent a rich hypothesis space of automatic speech recognition systems, via recurrent neural networks. We demonstrate the utility of our approach for dialog state tracking in spoken dialog systems, a task that relies on automatic speech recognition output. In a neural system for dialog state tracking, encoding confusion networks outperforms encoding only the best hypothesis of the automatic speech recognition system on the well-known second Dialog State Tracking Challenge dataset.
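The abstract does not spell out how a confusion network is fed to the recurrent network, so the following is only a minimal sketch under stated assumptions, not the paper's implementation. It assumes a confusion network is a sequence of slots, each holding alternative words with posterior probabilities, that each slot is pooled into one vector by a posterior-weighted sum of word embeddings, and that a plain tanh RNN reads the pooled vectors. The function names, the toy utterance, the probabilities, and the random weights are all hypothetical.

```python
import math
import random

# Assumed representation: a confusion network is a list of slots, and each
# slot is a list of (word, posterior) alternatives from the recognizer.

def make_embeddings(vocab, dim, rng):
    """Random toy embeddings; a real system would learn these."""
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in vocab}

def pool_slot(slot, emb, dim):
    """Posterior-weighted sum of the embeddings of a slot's alternatives."""
    total = sum(p for _, p in slot)  # renormalize within the slot
    vec = [0.0] * dim
    for word, p in slot:
        for i, x in enumerate(emb[word]):
            vec[i] += (p / total) * x
    return vec

def rnn_encode(inputs, w_in, w_rec, dim):
    """Plain Elman RNN with tanh; returns the final hidden state."""
    h = [0.0] * dim
    for x in inputs:
        h = [
            math.tanh(
                sum(w_in[i][j] * x[j] for j in range(dim))
                + sum(w_rec[i][j] * h[j] for j in range(dim))
            )
            for i in range(dim)
        ]
    return h

def encode_cnet(cnet, emb, w_in, w_rec, dim):
    """Pool every slot, then run the RNN over the pooled slot vectors."""
    return rnn_encode([pool_slot(s, emb, dim) for s in cnet], w_in, w_rec, dim)

def best_hypothesis(cnet):
    """Collapse each slot to its single most probable word (1-best baseline)."""
    return [[max(slot, key=lambda wp: wp[1])] for slot in cnet]

# Hypothetical toy input: two slots with competing recognition alternatives.
rng = random.Random(0)
DIM = 4
cnet = [
    [("cheap", 0.6), ("chief", 0.4)],
    [("restaurant", 0.9), ("rest", 0.1)],
]
vocab = sorted({w for slot in cnet for w, _ in slot})
emb = make_embeddings(vocab, DIM, rng)
w_in = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]
w_rec = [[rng.uniform(-0.5, 0.5) for _ in range(DIM)] for _ in range(DIM)]

h_cnet = encode_cnet(cnet, emb, w_in, w_rec, DIM)
h_best = encode_cnet(best_hypothesis(cnet), emb, w_in, w_rec, DIM)
```

In this sketch the 1-best baseline is just the special case where every slot keeps a single alternative, so both inputs pass through the same encoder. The only difference is whether the lower-ranked alternatives and their posteriors contribute to the slot vectors, which is the contrast the abstract draws.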