CV · AI · May 26

Visual-Noise Guided In-Context Distillation for Multimodal Large Language Model Unlearning

Junkai Chen, Yuhao He, Junxiang You, Ruiqi Liu, Chenyu Wang, Shu Wu

arXiv:2606.00105 · 72.1

Predicted impact: top 30% in CV · last 90 days
Originality: Incremental advance

AI Analysis

For practitioners needing to remove undesirable knowledge from MLLMs without retraining, this method offers a practical balance between unlearning effectiveness and model utility.

The paper tackles the problem of unlearning sensitive knowledge in Multimodal Large Language Models (MLLMs). The proposed VGID method achieves strong unlearning effectiveness with minimal utility loss: in a representative setting it reduces forget set ROUGE-L by 0.371, while retain set ROUGE-L drops by only 0.055.

Multimodal Large Language Models (MLLMs) have achieved remarkable progress on vision-language tasks, but they may also memorize and expose sensitive or restricted knowledge, raising concerns about privacy and broader safety risks. Machine Unlearning (MU) provides a promising way to remove targeted undesirable knowledge from trained models without retraining from scratch while preserving general model utility. Nevertheless, effective unlearning in MLLMs remains particularly challenging. Existing training-based methods often struggle to balance unlearning effectiveness and model utility. In contrast, training-free methods such as in-context unlearning preserve model utility by avoiding parameter updates, but they do not remove memorized knowledge at the parameter level and may remain vulnerable to reverse-engineering attacks. More importantly, in-context unlearning is insufficient in multimodal settings, where visual inputs can provide strong conditioning signals and induce undesirable outputs. To address these challenges, we propose Visual-Noise Guided In-Context Distillation (VGID), a distillation-based framework for MLLM unlearning. VGID dynamically constructs an unlearning-oriented teacher distribution from the frozen base model through dual-modal intervention that combines visual perturbation with textual in-context unlearning. The resulting intervention-induced distribution serves as a teacher signal for distillation, guiding the student model toward parameter-level unlearning without requiring external teacher models or explicit undesirable response annotations. Experimental results show that VGID achieves strong unlearning effectiveness while preserving competitive model utility, reducing forget set ROUGE-L by 0.371 with only a 0.055 drop in retain set ROUGE-L in a representative setting.
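The distillation idea described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the linear `toy_model`, the Gaussian form of the visual noise, the additive `icl_bias` standing in for a textual in-context unlearning prompt, the KL objective, and the choice to feed the student the clean image are all assumptions made for illustration. The only parts taken from the abstract are that the teacher distribution comes from the frozen base model under a dual-modal intervention and that the student is updated at the parameter level to match it.

```python
import numpy as np


def softmax(logits):
    """Numerically stable softmax over the last axis."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)


def add_visual_noise(image, sigma, rng):
    """Assumed visual perturbation: Gaussian pixel noise, clipped to [0, 1]."""
    noisy = image + rng.normal(0.0, sigma, size=image.shape)
    return np.clip(noisy, 0.0, 1.0)


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two 1-D probability vectors."""
    return float(np.sum(p * (np.log(p + eps) - np.log(q + eps))))


def toy_model(weights, image, prompt_bias):
    """Hypothetical stand-in for an MLLM: flattened image -> next-token logits."""
    return image.reshape(-1) @ weights + prompt_bias


def run_toy_distillation(steps=200, lr=0.1, sigma=0.5, seed=0):
    rng = np.random.default_rng(seed)
    vocab, side = 8, 4
    base_w = rng.normal(size=(side * side, vocab))  # frozen base model
    student_w = base_w.copy()                       # student starts from the base
    image = rng.uniform(size=(side, side))          # a forget-set image
    icl_bias = rng.normal(size=vocab)               # assumed effect of the unlearning prompt

    # Teacher: frozen base model under the dual-modal intervention
    # (noisy image plus in-context unlearning prompt).
    noisy_image = add_visual_noise(image, sigma, rng)
    teacher = softmax(toy_model(base_w, noisy_image, icl_bias))

    x = image.reshape(-1)
    initial_kl = kl_divergence(teacher, softmax(toy_model(student_w, image, 0.0)))
    for _ in range(steps):
        # Student sees the clean image and no unlearning prompt (an assumption).
        student = softmax(toy_model(student_w, image, 0.0))
        # Gradient of KL(teacher || student) w.r.t. student logits is student - teacher.
        grad_logits = student - teacher
        student_w -= lr * np.outer(x, grad_logits)
    final_kl = kl_divergence(teacher, softmax(toy_model(student_w, image, 0.0)))
    return initial_kl, final_kl


if __name__ == "__main__":
    before, after = run_toy_distillation()
    print(f"KL before: {before:.4f}, after: {after:.4f}")
```

In this toy setting the student's weights move until its output on the clean input matches what the frozen base model produces under intervention, which is the sense in which the behavior ends up in the parameters rather than relying on a prompt at inference time.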
