CV · AI · LG · May 14

ICED: Concept-level Machine Unlearning via Interpretable Concept Decomposition

arXiv:2605.14309 · 64.4
Predicted impact: top 52% in CV · last 90 days · Originality: Incremental advance
AI Analysis

For practitioners needing fine-grained control over what VLMs forget, this work provides a method to remove specific concepts without affecting unrelated image content, addressing a key limitation of instance-level unlearning.

The paper tackles imprecise knowledge removal in Vision-Language Models (VLMs), which arises when unlearning is performed at the instance level. It proposes ICED, a framework that decomposes visual representations into interpretable concepts so that target concepts can be selectively suppressed while non-target semantics are preserved. The authors report more comprehensive forgetting, better preservation of non-target knowledge within the same image, and competitive model utility compared with existing VLM unlearning methods.

Machine unlearning in Vision-Language Models (VLMs) is typically performed at the image or instance level, making it difficult to precisely remove target knowledge without affecting unrelated semantics. This issue is especially pronounced since a single image often contains multiple entangled concepts, including both target concepts to be forgotten and contextual information that should be preserved. In this paper, we propose an interpretable concept-level unlearning framework for VLMs, which constructs a compact task-specific concept vocabulary from the forgetting set using a multimodal large language model. In addition to modality alignment, visual representations are decomposed into sparse, nonnegative combinations of semantic concepts, providing an explicit interface for fine-grained knowledge manipulation. Based on this decomposition, our method formulates unlearning as concept-level optimization, where target concepts are selectively suppressed while intra-instance non-target semantics and global cross-modal knowledge are preserved. Extensive experiments across both in-domain and out-of-domain forgetting settings demonstrate that our method enables more comprehensive target forgetting, better preserves non-target knowledge within the same image, and maintains competitive model utility compared with existing VLM unlearning methods.
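The decomposition step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes a toy random concept dictionary in place of the MLLM-built vocabulary, uses projected ISTA (a standard solver for nonnegative sparse coding) as a stand-in for whatever solver the paper uses, and edits coefficients directly rather than optimizing model weights. The names `decompose`, `suppress`, `concepts`, and `lam` are hypothetical.

```python
import numpy as np

def decompose(x, concepts, lam=0.05, n_iter=500):
    """Approximate x as w @ concepts with w sparse and nonnegative.

    Minimizes 0.5 * ||x - w @ concepts||^2 + lam * sum(w) subject to w >= 0
    using projected ISTA (assumed solver, not taken from the paper).
    x: (d,) embedding; concepts: (k, d) concept directions.
    """
    # Step size 1/L, where L is the squared spectral norm of the dictionary.
    step = 1.0 / (np.linalg.norm(concepts, ord=2) ** 2)
    w = np.zeros(concepts.shape[0])
    for _ in range(n_iter):
        grad = concepts @ (w @ concepts - x)
        # Gradient step, L1 shrinkage, and projection onto w >= 0 in one line.
        w = np.maximum(0.0, w - step * (grad + lam))
    return w

def suppress(w, target_idx):
    """Zero the coefficients of the target concepts; leave the rest untouched."""
    w_edit = w.copy()
    w_edit[target_idx] = 0.0
    return w_edit

# Toy data: 6 unit-norm "concepts" in a 32-dimensional embedding space.
rng = np.random.default_rng(0)
concepts = rng.normal(size=(6, 32))
concepts /= np.linalg.norm(concepts, axis=1, keepdims=True)

# A synthetic "image embedding" mixing concept 0 (target) and concept 2 (context).
true_w = np.array([0.9, 0.0, 0.6, 0.0, 0.0, 0.0])
x = true_w @ concepts

w = decompose(x, concepts)
w_forgot = suppress(w, target_idx=[0])
x_forgot = w_forgot @ concepts  # embedding with the target concept removed
```

In this toy setting the coefficient vector `w` plays the role of the "explicit interface" mentioned in the abstract: the target concept is removed by editing one coefficient, while the coefficients of co-occurring concepts are left unchanged.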

