CL · Feb 28, 2024

A BiRGAT Model for Multi-intent Spoken Language Understanding with Hierarchical Semantic Frames

Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu

arXiv:2402.18258v12.76 citations · h-index: 16 · Has Code · ICASSP

Originality: Incremental advance

AI Analysis

This work addresses the limitation of single-intent SLU in realistic multi-intent scenarios, such as in-vehicle systems. It appears incremental, however, because it builds on existing graph attention and pointer-generator techniques.

The paper tackles multi-intent spoken language understanding, whereas previous work mainly focused on single-intent settings. It proposes a BiRGAT model that encodes the hierarchy of ontology items and, coupled with a 3-way pointer-generator decoder, outperforms traditional sequence labeling and classification-based schemes by a large margin on MIVS, a new dataset collected from an in-vehicle dialogue system.

Previous work on spoken language understanding (SLU) mainly focuses on single-intent settings, where each input utterance merely contains one user intent. This configuration significantly limits the surface form of user utterances and the capacity of output semantics. In this work, we first propose a Multi-Intent dataset which is collected from a realistic in-Vehicle dialogue System, called MIVS. The target semantic frame is organized in a 3-layer hierarchical structure to tackle the alignment and assignment problems in multi-intent cases. Accordingly, we devise a BiRGAT model to encode the hierarchy of ontology items, the backbone of which is a dual relational graph attention network. Coupled with the 3-way pointer-generator decoder, our method outperforms traditional sequence labeling and classification-based schemes by a large margin.
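To make the idea of a hierarchical semantic frame concrete, here is a minimal Python sketch of a 3-layer nested structure. The abstract says only that the frame has three layers, so the layer names used here (domain, intent, slot), the example utterance, and all labels are assumptions made for illustration; they are not taken from MIVS or from the authors' code. The point of the sketch is that nesting records which slot belongs to which intent, which a flat list of intents plus a flat list of slots would leave ambiguous.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical layer names (domain -> intent -> slot). The abstract only
# states that the frame has three layers; these names are an assumption.

@dataclass
class Slot:
    name: str
    value: str

@dataclass
class Intent:
    name: str
    slots: List[Slot] = field(default_factory=list)

@dataclass
class Domain:
    name: str
    intents: List[Intent] = field(default_factory=list)

def flatten(frame: List[Domain]) -> List[Tuple[str, str, str, str]]:
    """Turn the nested frame into (domain, intent, slot, value) tuples."""
    return [
        (d.name, i.name, s.name, s.value)
        for d in frame
        for i in d.intents
        for s in i.slots
    ]

# Invented utterance with two intents:
# "open the driver window and set the temperature to 22 degrees"
frame = [
    Domain("window", [Intent("open", [Slot("position", "driver")])]),
    Domain("climate", [Intent("set_temperature", [Slot("degrees", "22")])]),
]

rows = flatten(frame)
for row in rows:
    print(row)
```

Each flattened tuple keeps its full path through the hierarchy, so the slot value "22" stays attached to the temperature intent rather than to the window intent.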
