CL | MM | Dec 1, 2025

MAC-SLU: Multi-Intent Automotive Cabin Spoken Language Understanding Benchmark

arXiv:2512.01603v1 | 2 citations | h-index: 8 | Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for a unified SLU benchmark for automotive cabins, but the contribution is incremental: it applies existing methods to a new dataset.

The authors address the lack of diverse, complex datasets for Spoken Language Understanding (SLU) by introducing MAC-SLU, a multi-intent automotive cabin dataset. They benchmark LLMs and LALMs on it and find that supervised fine-tuning outperforms in-context learning, and that end-to-end LALMs match pipeline approaches while avoiding error propagation from speech recognition.

Spoken Language Understanding (SLU), which aims to extract user semantics to execute downstream tasks, is a crucial component of task-oriented dialog systems. Existing SLU datasets generally lack sufficient diversity and complexity, and there is an absence of a unified benchmark for the latest Large Language Models (LLMs) and Large Audio Language Models (LALMs). This work introduces MAC-SLU, a novel Multi-Intent Automotive Cabin Spoken Language Understanding Dataset, which increases the difficulty of the SLU task by incorporating authentic and complex multi-intent data. Based on MAC-SLU, we conducted a comprehensive benchmark of leading open-source LLMs and LALMs, covering methods like in-context learning, supervised fine-tuning (SFT), and end-to-end (E2E) and pipeline paradigms. Our experiments show that while LLMs and LALMs have the potential to complete SLU tasks through in-context learning, their performance still lags significantly behind SFT. Meanwhile, E2E LALMs demonstrate performance comparable to pipeline approaches and effectively avoid error propagation from speech recognition. Code (https://github.com/Gatsby-web/MAC_SLU) and datasets (huggingface.co/datasets/Gatsby1984/MAC_SLU) are released publicly.
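To make the error-propagation point concrete, here is a minimal toy sketch of the pipeline paradigm. Everything in it is hypothetical: the `Frame` structure, the intent and slot names, and the keyword-based `toy_nlu` function are illustrative stand-ins, not MAC-SLU's schema or the authors' code. It only shows how a single speech-recognition substitution can silently drop one intent from a multi-intent utterance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Frame:
    """One intent with its slots (hypothetical schema, not MAC-SLU's)."""
    intent: str
    slots: tuple = ()


def toy_nlu(text: str) -> list[Frame]:
    """Keyword-based stand-in for a text SLU model: one frame per clause."""
    frames = []
    for clause in text.lower().split(" and "):
        if "window" in clause:
            frames.append(Frame("open_window"))
        elif "temperature" in clause:
            digits = "".join(ch for ch in clause if ch.isdigit())
            frames.append(Frame("set_temperature", (("value", digits),)))
    return frames


def pipeline_slu(asr_transcript: str) -> list[Frame]:
    """Pipeline paradigm: the NLU stage sees only the ASR transcript."""
    return toy_nlu(asr_transcript)


reference = "open the window and set the temperature to 22"
# Simulated ASR error (assumption): "window" misrecognized as "widow".
noisy = "open the widow and set the temperature to 22"

gold = toy_nlu(reference)        # two frames: open_window, set_temperature
predicted = pipeline_slu(noisy)  # the open_window frame is lost

print("gold:     ", gold)
print("predicted:", predicted)
print("exact match:", gold == predicted)
```

An E2E model maps audio directly to frames, so there is no intermediate transcript to corrupt. That path is not simulated here because it would require an actual audio model.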
