LG | AI | CV | Jun 3, 2025

Rethinking Post-Unlearning Behavior of Large Vision-Language Models

arXiv:2506.02541v1 | h-index: 5
Originality: Incremental advance
AI Analysis

The work addresses privacy risks that generative AI models pose for users and developers, though it is an incremental advance that builds on existing unlearning methods.

The paper tackles undesirable behaviors, such as degenerate or hallucinated responses, that appear in large vision-language models after machine unlearning is applied for privacy. It shows that the proposed PUBG method mitigates these issues while preserving privacy.

Machine unlearning is used to mitigate the privacy risks of Large Vision-Language Models (LVLMs) arising from training on large-scale web data. However, existing unlearning methods often fail to carefully select substitute outputs for forget targets, resulting in Unlearning Aftermaths: undesirable behaviors such as degenerate, hallucinated, or excessively refused responses. We highlight that, especially for generative LVLMs, it is crucial to consider the quality and informativeness of post-unlearning responses rather than relying solely on naive suppression. To address this, we introduce a new unlearning task for LVLMs that requires models to provide privacy-preserving yet informative and visually grounded responses. We also propose PUBG, a novel unlearning method that explicitly guides post-unlearning behavior toward a desirable output distribution. Experiments show that, while existing methods suffer from Unlearning Aftermaths despite successfully preventing privacy violations, PUBG effectively mitigates these issues, generating visually grounded and informative responses without privacy leakage for forgotten targets.

