LG AI Jul 2, 2025

PULSE: Practical Evaluation Scenarios for Large Multimodal Model Unlearning

Tatsuki Kawakami, Kazuki Egashira, Atsuyuki Miyai, Go Irie, Kiyoharu Aizawa

arXiv:2507.01271v49.43 citations h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses privacy and copyright concerns for LMM users by providing a more realistic evaluation framework, though it is incremental as it builds on existing unlearning benchmarks.

The study addresses the lack of practical evaluation frameworks for unlearning in large multimodal models (LMMs) by introducing the PULSE protocol, which assesses unlearning across knowledge acquisition phases (pre-training versus fine-tuning) and under sequential requests. The evaluation shows that existing methods struggle to remove pre-trained knowledge, and that their performance degrades when the same target data are unlearned sequentially rather than in a single operation.

In recent years, unlearning techniques, which are methods for inducing a model to "forget" previously learned information, have attracted attention as a way to address privacy and copyright concerns in large language models (LLMs) and large multimodal models (LMMs). While several unlearning benchmarks have been established for LLMs, a practical evaluation framework for unlearning in LMMs has been less explored. Specifically, the existing unlearning benchmark for LMMs considers only scenarios in which the model is required to unlearn fine-tuned knowledge through a single unlearning operation. In this study, we introduce the PULSE protocol for realistic unlearning scenarios for LMMs by introducing two critical perspectives: (i) Pre-trained knowledge Unlearning for analyzing the effect across different knowledge acquisition phases and (ii) Long-term Sustainability Evaluation to address sequential requests. We then evaluate existing unlearning methods along these dimensions. Our results reveal that, although some techniques can successfully unlearn knowledge acquired through fine-tuning, they struggle to eliminate information learned during pre-training. Moreover, methods that effectively unlearn a batch of target data in a single operation exhibit substantial performance degradation when the same data are split and unlearned sequentially.
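
The contrast between a single unlearning operation and sequential requests can be made concrete with a small harness. The following is only a toy sketch under stated assumptions, not the PULSE implementation: the "model" is a set of known facts, and `split_into_requests`, `run_requests`, `exact_removal`, and `make_evaluator` are hypothetical names introduced here for illustration. It shows how the same forget set can be processed either in one request or as several, with forget and retain rates measured after each operation.

```python
from typing import Callable, FrozenSet, List, Sequence, Tuple

# Toy stand-in (assumption): a "model" is just the set of facts it still knows.
Model = FrozenSet[str]
Scores = Tuple[float, float]  # (fraction of forget set still known, fraction of retain set still known)


def split_into_requests(forget_set: Sequence[str], num_requests: int) -> List[List[str]]:
    """Split the forget set into num_requests contiguous, near-equal chunks."""
    if num_requests < 1:
        raise ValueError("num_requests must be at least 1")
    size, extra = divmod(len(forget_set), num_requests)
    chunks, start = [], 0
    for i in range(num_requests):
        end = start + size + (1 if i < extra else 0)
        chunks.append(list(forget_set[start:end]))
        start = end
    return chunks


def run_requests(
    model: Model,
    requests: Sequence[Sequence[str]],
    unlearn_step: Callable[[Model, Sequence[str]], Model],
    evaluate: Callable[[Model], Scores],
) -> Tuple[Model, List[Scores]]:
    """Apply one unlearning operation per request and evaluate after each one."""
    history = []
    for request in requests:
        model = unlearn_step(model, request)
        history.append(evaluate(model))
    return model, history


def exact_removal(model: Model, request: Sequence[str]) -> Model:
    """Idealised toy unlearning step (assumption): drop exactly the requested facts."""
    return model - frozenset(request)


def make_evaluator(forget_set: Sequence[str], retain_set: Sequence[str]) -> Callable[[Model], Scores]:
    """Build a function that reports how much of each set the model still knows."""
    def evaluate(model: Model) -> Scores:
        forget_rate = sum(f in model for f in forget_set) / len(forget_set)
        retain_rate = sum(r in model for r in retain_set) / len(retain_set)
        return forget_rate, retain_rate
    return evaluate


forget_set = ["f1", "f2", "f3", "f4"]
retain_set = ["r1", "r2"]
initial_model: Model = frozenset(forget_set + retain_set)
evaluate = make_evaluator(forget_set, retain_set)

# Single operation: the whole forget set arrives as one request.
batch_model, batch_history = run_requests(initial_model, [forget_set], exact_removal, evaluate)

# Sequential: the same forget set is split into two requests.
seq_model, seq_history = run_requests(
    initial_model, split_into_requests(forget_set, 2), exact_removal, evaluate
)

print(batch_history)  # [(0.0, 1.0)]
print(seq_history)    # [(0.5, 1.0), (0.0, 1.0)]
```

With this idealised removal step the two schedules end in the same state. The abstract reports that real unlearning methods do not behave this way, which is why scores are recorded after every request rather than only at the end.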
