AFD-SLU: Adaptive Feature Distillation for Spoken Language Understanding
This work addresses data scarcity and computational cost in SLU for conversational systems, but the contribution is incremental: it builds on existing distillation and adaptation techniques.
The paper tackles two obstacles to effective Spoken Language Understanding (SLU) systems, scarce labeled data and the computational burden of large models, by proposing an Adaptive Feature Distillation framework. On the Chinese profile-based ProSLU benchmark it reports state-of-the-art results: 95.67% intent accuracy, 92.02% slot F1 score, and 85.50% overall accuracy.
Spoken Language Understanding (SLU) is a core component of conversational systems, enabling machines to interpret user utterances. Despite its importance, developing effective SLU systems remains challenging due to the scarcity of labeled training data and the computational burden of deploying Large Language Models (LLMs) in real-world applications. To alleviate these issues, we propose an Adaptive Feature Distillation framework that transfers rich semantic representations from a General Text Embeddings (GTE)-based teacher model to a lightweight student model. Our method introduces a dynamic adapter equipped with a Residual Projection Neural Network (RPNN) to align heterogeneous feature spaces, and a Dynamic Distillation Coefficient (DDC) that adaptively modulates the distillation strength based on real-time feedback from intent and slot prediction performance. Experiments on the Chinese profile-based ProSLU benchmark demonstrate that AFD-SLU achieves state-of-the-art results, with 95.67% intent accuracy, 92.02% slot F1 score, and 85.50% overall accuracy.
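The two mechanisms named in the abstract can be illustrated with a minimal sketch. The abstract gives no formulas, so everything below is an assumption made for illustration: the residual form of the projection, the use of mean squared error as the feature distillation loss, the linear mapping from validation metrics to the coefficient, and all function and parameter names (`residual_projection`, `dynamic_coefficient`, `lam_min`, `lam_max`) are hypothetical and may differ from the authors' RPNN and DDC.

```python
# Illustrative sketch only; not the authors' implementation.

def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def residual_projection(x, W_in, W_res):
    """Map student features into the teacher's feature space.

    Assumed form: h = W_in x, output = h + W_res relu(h).
    W_in changes the dimension; W_res adds a residual refinement.
    """
    h = matvec(W_in, x)
    r = matvec(W_res, [max(0.0, v) for v in h])
    return [hi + ri for hi, ri in zip(h, r)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((ai - bi) ** 2 for ai, bi in zip(a, b)) / len(a)

def dynamic_coefficient(intent_acc, slot_f1, lam_min=0.1, lam_max=1.0):
    """Assumed rule: distill more strongly while the student performs poorly.

    intent_acc and slot_f1 are in [0, 1]; the result lies in [lam_min, lam_max].
    """
    score = 0.5 * (intent_acc + slot_f1)
    return lam_min + (lam_max - lam_min) * (1.0 - score)

def total_loss(task_loss, student_feat, teacher_feat,
               W_in, W_res, intent_acc, slot_f1):
    """Task loss plus the dynamically weighted feature distillation loss."""
    aligned = residual_projection(student_feat, W_in, W_res)
    lam = dynamic_coefficient(intent_acc, slot_f1)
    return task_loss + lam * mse(aligned, teacher_feat)
```

Under these assumptions, a student whose projected features already match the teacher's contributes no distillation penalty, and the coefficient shrinks toward `lam_min` as intent accuracy and slot F1 approach 1. A real system would use learned weights in a tensor library rather than fixed Python lists.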