CL · AI · Mar 7, 2024

ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent

arXiv:2403.04481v4 · 14 citations · h-index: 5 · Has Code · ACL
AI Analysis

This work addresses token-level misalignment and the difficulty of capturing semantic interrelations in spoken language understanding, which matters for applications like voice assistants. It is an incremental improvement, with measured gains on MixATIS and MixSNIPS.

The paper tackles the challenge of applying large language models to spoken language understanding by proposing the ECLM framework, which reformulates slot-filling as entity recognition and introduces Chain of Intent for step-by-step multi-intent recognition. ECLM achieves gains of 3.7% on MixATIS and 3.1% on MixSNIPS over strong baselines such as Uni-MIS.

Large Language Models (LLMs) have demonstrated impressive capabilities in language generation and general task performance. However, their application to spoken language understanding (SLU) remains challenging, particularly for token-level tasks, where the autoregressive nature of LLMs often leads to misalignment issues. They also struggle to capture nuanced interrelations in semantic-level tasks through direct fine-tuning alone. To address these challenges, we propose the Entity-level Language Model (ECLM) framework, which reformulates slot-filling as an entity recognition task and introduces a novel concept, Chain of Intent, to enable step-by-step multi-intent recognition. Experimental results show that ECLM significantly outperforms strong baselines such as Uni-MIS, achieving gains of 3.7% on MixATIS and 3.1% on MixSNIPS. Compared to standard supervised fine-tuning of LLMs, ECLM further achieves improvements of 8.5% and 21.2% on these datasets, respectively. Our code is available at https://github.com/SJY8460/ECLM.
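
As a rough illustration of what reformulating slot-filling as entity recognition can look like, the sketch below collapses token-level BIO slot tags into entity-level (slot type, text) pairs. This is not the authors' implementation. It assumes BIO-tagged input, and the function name `bio_to_entities`, the example utterance, and the slot labels are hypothetical choices made for illustration. The actual target format used by ECLM is not described in the abstract.

```python
from typing import List, Tuple


def bio_to_entities(tokens: List[str], tags: List[str]) -> List[Tuple[str, str]]:
    """Collapse token-level BIO tags into entity-level (slot_type, text) pairs.

    Hypothetical helper for illustration only; assumes tags of the form
    "B-<type>", "I-<type>", or "O".
    """
    entities: List[Tuple[str, str]] = []
    current_type = None
    current_tokens: List[str] = []

    def flush() -> None:
        # Emit the entity collected so far, if any.
        if current_type is not None:
            entities.append((current_type, " ".join(current_tokens)))

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            # A new entity begins; close the previous one first.
            flush()
            current_type = tag[2:]
            current_tokens = [token]
        elif tag.startswith("I-") and current_type == tag[2:]:
            # Continuation of the entity currently being collected.
            current_tokens.append(token)
        else:
            # "O" or an inconsistent "I-" tag: close any open entity
            # and treat this token as outside.
            flush()
            current_type = None
            current_tokens = []

    flush()
    return entities


# Illustrative utterance with ATIS-style slot labels (assumed, not from the paper).
tokens = ["show", "flights", "from", "denver", "to", "san", "francisco"]
tags = ["O", "O", "O", "B-fromloc.city_name", "O",
        "B-toloc.city_name", "I-toloc.city_name"]
entities = bio_to_entities(tokens, tags)
```

Producing whole entities rather than one label per token means the output no longer has to match the input token by token, which is the kind of alignment constraint the abstract says autoregressive models handle poorly.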

Code Implementations: 1 repo
