SD | AI | CL | AS | Oct 19, 2025

SAKE: Towards Editing Auditory Attribute Knowledge of Large Audio-Language Models

arXiv:2510.16917v1 | 4 citations | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient knowledge updates in Large Audio-Language Models (LALMs) for real-world applications, but it is incremental: it extends existing knowledge editing techniques from the textual and visual modalities to the auditory one.

The paper tackles the problem of updating knowledge in Large Audio-Language Models (LALMs) without full retraining. It introduces SAKE, the first benchmark for editing auditory attribute knowledge, and benchmarks seven editing methods on two LALMs across four dimensions. The results reveal challenges such as preserving unrelated intra-attribute knowledge and generalizing edits to multimodal reasoning.

Knowledge editing offers an efficient way to update model knowledge without full retraining, but prior work has concentrated almost exclusively on textual or visual modalities. We introduce SAKE, the first benchmark specifically designed for editing auditory attribute knowledge in Large Audio-Language Models (LALMs). Unlike factual updates, SAKE targets several abstract auditory attributes, capturing knowledge types that go beyond conventional textual and visual domains. We benchmark seven editing methods on two LALMs along four dimensions: reliability, generality, audio/text locality, and portability. Results highlight challenges such as preserving intra-attribute knowledge unrelated to the edit, generalizing edits to multimodal reasoning, and maintaining edits under sequential updates. SAKE provides a principled framework to study how knowledge editing extends to the auditory modalities, opening new directions for maintaining and adapting LALMs in more diverse real-world scenarios.
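The four evaluation dimensions named above can be illustrated with a minimal sketch. It uses the definitions that are conventional in the knowledge-editing literature, not necessarily the exact ones SAKE uses. The `EditCase` fields, the `score_edit` function, and the representation of a model as a plain prompt-to-answer function (with a string standing in for an audio clip plus question) are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Sequence, Tuple

# Hypothetical stand-in: a "model" maps a prompt (audio + question) to an answer string.
Model = Callable[[str], str]


@dataclass
class EditCase:
    """One edit plus the probes used to score it (all field names are assumptions)."""
    edit_prompt: str                       # the input whose answer is being edited
    target: str                            # the desired post-edit answer
    paraphrases: Sequence[str]             # equivalent inputs that should also change
    unrelated: Sequence[str]               # inputs whose answers should stay the same
    reasoning: Sequence[Tuple[str, str]]   # (question, expected answer) relying on the edit


def _mean(flags: Iterable[bool]) -> float:
    """Fraction of True values; NaN when there is nothing to score."""
    values = list(flags)
    return sum(values) / len(values) if values else float("nan")


def score_edit(pre: Model, post: Model, case: EditCase) -> Dict[str, float]:
    """Score one edit by comparing the pre-edit and post-edit models."""
    return {
        # Reliability: the edited input now yields the target answer.
        "reliability": float(post(case.edit_prompt) == case.target),
        # Generality: equivalent inputs also yield the target answer.
        "generality": _mean(post(p) == case.target for p in case.paraphrases),
        # Locality: unrelated inputs keep their pre-edit answers.
        "locality": _mean(post(u) == pre(u) for u in case.unrelated),
        # Portability: downstream questions that depend on the edit are answered correctly.
        "portability": _mean(post(q) == a for q, a in case.reasoning),
    }


# Toy usage with lookup tables in place of real LALMs.
pre_answers = {
    "clip-1: which animal?": "dog",
    "clip-1: name the animal": "dog",
    "clip-2: speaker gender?": "female",
    "clip-1: which animal family?": "canine",
}
post_answers = dict(pre_answers)
post_answers["clip-1: which animal?"] = "cat"
post_answers["clip-1: name the animal"] = "cat"

pre_model: Model = lambda prompt: pre_answers.get(prompt, "unknown")
post_model: Model = lambda prompt: post_answers.get(prompt, "unknown")

case = EditCase(
    edit_prompt="clip-1: which animal?",
    target="cat",
    paraphrases=["clip-1: name the animal"],
    unrelated=["clip-2: speaker gender?"],
    reasoning=[("clip-1: which animal family?", "feline")],
)
scores = score_edit(pre_model, post_model, case)
```

In this toy case the edit is reliable, general, and local, but it does not carry over to the dependent reasoning question. This is the kind of gap the portability dimension is meant to expose.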

