CV · LG · Mar 27

SALMUBench: A Benchmark for Sensitive Association-Level Multimodal Unlearning

arXiv:2603.26316 · 41.7 · h-index: 3

AI Analysis

The paper addresses the need for fine-grained unlearning evaluation in multimodal AI systems. Its contribution is incremental: it provides a benchmark rather than a new unlearning method.

The paper tackles the problem of removing sensitive information from multimodal models like CLIP. It introduces SALMUBench, a benchmark for evaluating association-level unlearning, and finds that current methods either fail to forget effectively or over-generalize.

As multimodal models like CLIP become integral to downstream systems, the need to remove sensitive information is critical. However, machine unlearning for contrastively-trained encoders remains underexplored, and existing evaluations fail to diagnose fine-grained, association-level forgetting. We introduce SALMUBench (Sensitive Association-Level Multimodal Unlearning), a benchmark built upon a synthetic dataset of 60K persona-attribute associations and two foundational models: a Compromised model polluted with this data, and a Clean model without it. To isolate unlearning effects, both are trained from scratch on the same 400M-pair retain base, with the Compromised model additionally trained on the sensitive set. We propose a novel evaluation protocol with structured holdout sets (holdout identity, holdout association) to precisely measure unlearning efficacy and collateral damage. Our benchmark reveals that while utility-efficient deletion is feasible, current methods exhibit distinct failure modes: they either fail to forget effectively or over-generalize by erasing more than intended. SALMUBench sets a new standard for comprehensive unlearning evaluation, and we publicly release our dataset, models, evaluation scripts, and leaderboards to foster future research.
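The two failure modes named in the abstract can be illustrated with a minimal sketch. It is not the benchmark's actual metric or released evaluation script. It assumes that each split (forget, holdout identity, holdout association) is summarized by a mean image-text cosine similarity, and that the holdout splits contain associations the unlearning request did not target, so their scores should stay close to the Compromised model. The function names, split keys, and the 0.5 threshold are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors (plain lists of floats)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_association(pairs):
    """Mean cosine similarity over (image_embedding, text_embedding) pairs."""
    return sum(cosine(img, txt) for img, txt in pairs) / len(pairs)


def gap_closed(unlearned, compromised, clean):
    """Fraction of the Compromised-to-Clean score gap closed by unlearning.

    0.0 means the score still matches the Compromised model,
    1.0 means it matches the Clean model.
    """
    span = compromised - clean
    if span == 0:
        return 0.0
    return (compromised - unlearned) / span


def diagnose(unlearned, compromised, clean, threshold=0.5):
    """Classify an unlearning run from per-split mean association scores.

    Each argument is a dict with the hypothetical keys "forget",
    "holdout_identity", and "holdout_association". The threshold is an
    arbitrary illustrative cut-off, not a value from the paper.
    """
    forgotten = gap_closed(
        unlearned["forget"], compromised["forget"], clean["forget"]
    )
    collateral = max(
        gap_closed(unlearned[split], compromised[split], clean[split])
        for split in ("holdout_identity", "holdout_association")
    )
    if forgotten < threshold:
        return "fails to forget"
    if collateral >= threshold:
        return "over-generalizes"
    return "targeted forgetting"


if __name__ == "__main__":
    # Made-up scores for illustration only; these are not results from the paper.
    splits = ("forget", "holdout_identity", "holdout_association")
    compromised = {s: 0.8 for s in splits}
    clean = {s: 0.2 for s in splits}
    run = {"forget": 0.25, "holdout_identity": 0.3, "holdout_association": 0.3}
    print(diagnose(run, compromised, clean))
```

Normalizing against both reference models is the reason the sketch takes a Compromised and a Clean score for every split: a raw similarity drop is hard to interpret without knowing how far the score could have fallen.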
