Knowledge Vector Weakening: Efficient Training-free Unlearning for Large Vision-Language Models
This work addresses privacy and safety issues in large vision-language models with an efficient unlearning method. The contribution is incremental: it builds on existing unlearning concepts, and its novelty lies in the training-free approach.
The paper tackles the problem of efficiently removing the influence of specific data from large vision-language models to address privacy and harmful-content concerns. It proposes Knowledge Vector Weakening, a training-free unlearning method that achieves a stable forget-retain trade-off and significantly improves computational efficiency over existing gradient-based methods.
Large Vision-Language Models (LVLMs) are widely adopted for their strong multimodal capabilities, yet they raise serious concerns such as privacy leakage and harmful content generation. Machine unlearning has emerged as a promising solution for removing the influence of specific data from trained models. However, existing approaches largely rely on gradient-based optimization, incurring substantial computational costs for large-scale LVLMs. To address this limitation, we propose Knowledge Vector Weakening (KVW), a training-free unlearning method that directly intervenes in the full model without gradient computation. KVW identifies knowledge vectors that are activated during the model's output generation on the forget set and progressively weakens their contributions, thereby preventing the model from exploiting undesirable knowledge. Experiments on the MLLMU and CLEAR benchmarks demonstrate that KVW achieves a stable forget-retain trade-off while significantly improving computational efficiency over gradient-based and LoRA-based unlearning methods.
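To make the idea of weakening activated knowledge vectors concrete, the following is a minimal toy sketch and not the authors' implementation. The abstract does not define a knowledge vector or the weakening schedule, so the sketch assumes that each row of a feed-forward output matrix is one knowledge vector, that its contribution is weighted by a ReLU activation coefficient, and that weakening means repeated multiplicative scaling. The names `weaken_knowledge_vectors`, `decay`, and `steps` are hypothetical.

```python
# Toy sketch under stated assumptions; not the KVW algorithm from the paper.
# Assumption: row i of w_out is a "knowledge vector" and row i of w_in
# determines how strongly it is activated for a given input.

def relu(x):
    return x if x > 0.0 else 0.0

def activations(w_in, x):
    """Activation coefficient of each hidden unit for input x."""
    return [relu(sum(w * xi for w, xi in zip(row, x))) for row in w_in]

def ffn(w_in, w_out, x):
    """Output is the coefficient-weighted sum of the knowledge vectors."""
    coeffs = activations(w_in, x)
    dim = len(w_out[0])
    return [sum(c * vec[j] for c, vec in zip(coeffs, w_out)) for j in range(dim)]

def weaken_knowledge_vectors(w_in, w_out, forget_inputs, decay=0.5, steps=3):
    """Return a copy of w_out with forget-set-activated vectors scaled down.

    No gradients are computed: only forward activations are used.
    `decay` and `steps` are hypothetical knobs for this illustration.
    """
    # Mean activation of each hidden unit over the forget set.
    n = len(forget_inputs)
    mean_act = [0.0] * len(w_in)
    for x in forget_inputs:
        for i, a in enumerate(activations(w_in, x)):
            mean_act[i] += a / n

    new_out = [list(vec) for vec in w_out]  # leave the original weights intact
    peak = max(mean_act)
    if peak == 0.0:
        return new_out  # nothing is activated by the forget set

    # In this single-layer toy the activations do not change between steps,
    # so they are computed once and the scaling is applied `steps` times.
    for _ in range(steps):
        for i, a in enumerate(mean_act):
            scale = 1.0 - decay * (a / peak)  # most active unit shrinks most
            new_out[i] = [v * scale for v in new_out[i]]
    return new_out
```

In this toy setting, a vector that is never activated by the forget set keeps a scale of 1.0 and is left unchanged, which mirrors the forget-retain trade-off the abstract describes. How KVW actually selects vectors and schedules the weakening is not specified in the text above.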