A Self-Attention Joint Model for Spoken Language Understanding in Situational Dialog Applications
This work addresses the need for more tightly integrated SLU methods in goal-oriented dialog systems and offers an incremental improvement for educational applications.
The paper tackles spoken language understanding in dialog systems by proposing a multi-head self-attention model with a CRF layer and a prior mask that jointly handles intent detection and slot filling, and it reports effectiveness compared with state-of-the-art models. It also applies the model to an intelligent dialog robot for foreign language learning in online education in China.
Spoken language understanding (SLU) is a critical component of goal-oriented dialog systems. It typically involves identifying the speaker's intent and extracting semantic slots from user utterances, tasks known as intent detection (ID) and slot filling (SF). The SLU problem has been intensively investigated in recent years. However, existing methods only constrain SF results grammatically, solve ID and SF independently, or do not fully utilize the mutual impact of the two tasks. This paper proposes a multi-head self-attention joint model with a conditional random field (CRF) layer and a prior mask. Experiments show the effectiveness of our model compared with state-of-the-art models. Meanwhile, online education in China has made great progress in the last few years, but there are few intelligent educational dialog applications that help students learn foreign languages. Hence, we design an intelligent dialog robot equipped with different scenario settings to help students learn communication skills.
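To make the two tasks concrete, the following minimal sketch shows what a combined ID and SF output might look like for one utterance. It assumes the common BIO tagging convention for slots, which the abstract does not confirm; the utterance, the intent label `book_restaurant`, the slot names, and the helper `extract_slots` are all hypothetical and serve only as illustration. It does not implement the proposed model.

```python
from typing import Dict, List, Tuple


def extract_slots(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collect (slot_name, value) pairs from per-token BIO tags (assumed convention)."""
    slots: List[Tuple[str, str]] = []
    name, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new slot begins; close any slot that is still open.
            if name is not None:
                slots.append((name, " ".join(words)))
            name, words = tag[2:], [token]
        elif tag.startswith("I-") and name == tag[2:]:
            # Continuation of the currently open slot.
            words.append(token)
        else:
            # "O" tag or an inconsistent "I-" tag: close the open slot, if any.
            if name is not None:
                slots.append((name, " ".join(words)))
            name, words = None, []
    if name is not None:
        slots.append((name, " ".join(words)))
    return slots


# Hypothetical utterance with one intent label (ID) and one tag per token (SF).
tokens = ["book", "a", "table", "for", "two", "at", "seven", "pm"]
tags = ["O", "O", "O", "O", "B-party_size", "O", "B-time", "I-time"]
intent = "book_restaurant"

# The semantic frame a dialog system could consume downstream.
frame: Dict[str, object] = {"intent": intent, "slots": extract_slots(tokens, tags)}
```

Here `frame["slots"]` pairs each slot name with the words it covers, so a single utterance yields one intent and a list of slot values. A joint model predicts both from a shared representation rather than from two separate models.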