Iterative Delexicalization for Improved Spoken Language Understanding
This work addresses a key bottleneck in building robust spoken language understanding and dialog systems, particularly the handling of out-of-distribution slot values, though the contribution is incremental because it builds on existing delexicalization techniques.
The paper tackles the poor performance of RNN-based spoken language understanding models on slots with large semantic variability, especially slots with out-of-vocabulary values. It proposes an iterative delexicalization algorithm that improves parsing performance, with significant gains reported on benchmark and in-house datasets.
Recurrent neural network (RNN) based joint intent classification and slot tagging models have achieved tremendous success in recent years for building spoken language understanding and dialog systems. However, these models perform poorly on slots whose values show large semantic variability after deployment (e.g., message texts, partial movie/artist names). While greedy delexicalization of slots in the input utterance via substring matching can partly improve performance, it often produces incorrect input. Moreover, such techniques cannot delexicalize slots with out-of-vocabulary values not seen during training. In this paper, we propose a novel iterative delexicalization algorithm, which can accurately delexicalize the input, even with out-of-vocabulary slot values. Based on the model's confidence in the current delexicalized input, our algorithm improves the delexicalization in every iteration and converges to the input with the highest confidence. We show on benchmark and in-house datasets that our algorithm can greatly improve parsing performance for RNN-based models, especially for out-of-distribution slot values.
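The confidence-driven loop described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's algorithm, whose details are not given here. It is a simple hill-climbing variant written under stated assumptions: the model is treated as a hypothetical callable that returns per-token slot tags and one confidence score, a candidate delexicalization is accepted only if that confidence strictly increases, and `toy_model`, `KNOWN`, `delexicalize`, and `iterative_delexicalize` are illustrative names introduced for this example.

```python
from typing import Callable, List, Tuple

Tokens = List[str]
# Assumed interface (hypothetical): tokens -> (per-token slot tags, confidence).
Model = Callable[[Tokens], Tuple[List[str], float]]


def is_placeholder(tok: str) -> bool:
    """True for delexicalized tokens such as '<artist>'."""
    return tok.startswith("<") and tok.endswith(">")


def delexicalize(tokens: Tokens, tags: List[str]) -> Tokens:
    """Replace tagged tokens with '<slot>' placeholders, merging adjacent duplicates."""
    out: Tokens = []
    for tok, tag in zip(tokens, tags):
        new_tok = tok if (tag == "O" or is_placeholder(tok)) else f"<{tag}>"
        if is_placeholder(new_tok) and out and out[-1] == new_tok:
            continue  # fold into the identical placeholder just before it
        out.append(new_tok)
    return out


def iterative_delexicalize(
    tokens: Tokens, model: Model, max_iters: int = 5
) -> Tuple[Tokens, float]:
    """Re-delexicalize while the model's confidence keeps improving (assumed rule)."""
    current = list(tokens)
    tags, best_conf = model(current)
    for _ in range(max_iters):
        candidate = delexicalize(current, tags)
        if candidate == current:
            break  # nothing left to delexicalize
        cand_tags, cand_conf = model(candidate)
        if cand_conf <= best_conf:
            break  # candidate is not better; keep the current input
        current, tags, best_conf = candidate, cand_tags, cand_conf
    return current, best_conf


# Toy stand-in for a trained tagger (an assumption, for demonstration only).
KNOWN = {"play", "songs", "by"}


def toy_model(tokens: Tokens) -> Tuple[List[str], float]:
    tags = ["O"] * len(tokens)
    for i in range(1, len(tokens)):
        prev, tok = tokens[i - 1], tokens[i]
        unknown = tok not in KNOWN and not is_placeholder(tok)
        # Tag an unknown word as 'artist' if it follows 'by' or an artist placeholder.
        if unknown and prev in ("by", "<artist>"):
            tags[i] = "artist"
    # Confidence: fraction of tokens the toy model "recognizes".
    familiar = sum(t in KNOWN or is_placeholder(t) for t in tokens)
    return tags, familiar / len(tokens)


result, conf = iterative_delexicalize("play songs by zzyx quorn".split(), toy_model)
print(result, conf)  # ['play', 'songs', 'by', '<artist>'] 1.0
```

In this toy run the out-of-vocabulary name "zzyx quorn" is absorbed one token per iteration, and the confidence rises from 0.6 to 0.8 to 1.0 before the loop stops. A real system would replace `toy_model` with the trained joint intent and slot tagging model and its own confidence measure.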