LG ML Nov 24, 2025

Subtract the Corruption: Training-Data-Free Corrective Machine Unlearning using Task Arithmetic

Mostafa Mozafari, Farooq Ahmad Wani, Maria Sofia Bucarelli, Fabrizio Silvestri

arXiv:2511.18660v2

Originality: Highly original

AI Analysis

This work addresses corrective machine unlearning in real-world scenarios where the training data is inaccessible, offering a way to improve model safety and performance without data access.

The paper tackles the problem of removing the influence of corrupted training data from a model when the original training data is unavailable, introducing a method called CUTS that recovers a large fraction of lost utility under label noise and nearly eliminates backdoor attacks with minimal damage to utility.

Corrupted training data are ubiquitous. Corrective Machine Unlearning (CMU) seeks to remove the influence of such corruption post-training. Prior CMU typically assumes access to identified corrupted training samples (a "forget set"). However, in many real-world scenarios the training data are no longer accessible. We formalize source-free CMU, where the original training data are unavailable and, consequently, no forget set of identified corrupted training samples can be specified. Instead, we assume a small proxy (surrogate) set of corrupted samples that reflect the suspected corruption type without needing to be the original training samples. In this stricter setting, methods relying on a forget set are ineffective or narrow in scope. We introduce Corrective Unlearning in Task Space (CUTS), a lightweight weight-space correction method guided by the proxy set using task arithmetic principles. CUTS treats the clean and the corruption signal as distinct tasks. Specifically, we briefly fine-tune the corrupted model on the proxy to amplify the corruption mechanism in the weight space, compute the difference between the corrupted and fine-tuned weights as a proxy task vector, and subtract a calibrated multiple of this vector to cancel the corruption. Without access to clean data or a forget set, CUTS recovers a large fraction of the lost utility under label noise and, for backdoor triggers, nearly eliminates the attack with minimal damage to utility, outperforming state-of-the-art specialized CMU methods in the source-free setting.
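The weight-space step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: weights are plain dictionaries of float lists, the fine-tuned weights are supplied as toy values rather than produced by actual fine-tuning on a proxy set, and the names task_vector, subtract_scaled, and alpha are hypothetical. The sign convention (task vector = fine-tuned minus corrupted) is an assumption borrowed from common task-arithmetic usage, and the value of alpha here is arbitrary, whereas the abstract says the multiple is calibrated.

```python
from typing import Dict, List

# Hypothetical representation: parameter name -> flat list of values.
Weights = Dict[str, List[float]]


def task_vector(corrupted: Weights, finetuned: Weights) -> Weights:
    """Per-parameter difference, fine-tuned minus corrupted (assumed sign convention)."""
    return {
        name: [f - c for f, c in zip(finetuned[name], corrupted[name])]
        for name in corrupted
    }


def subtract_scaled(corrupted: Weights, tau: Weights, alpha: float) -> Weights:
    """Move the corrupted weights against the task vector by a factor alpha."""
    return {
        name: [c - alpha * t for c, t in zip(corrupted[name], tau[name])]
        for name in corrupted
    }


# Toy values standing in for a corrupted model and the same model after
# brief fine-tuning on a proxy set of corrupted samples.
theta_corrupted = {"layer.weight": [1.0, 2.0], "layer.bias": [0.5]}
theta_finetuned = {"layer.weight": [1.5, 2.0], "layer.bias": [0.25]}

tau = task_vector(theta_corrupted, theta_finetuned)
theta_corrected = subtract_scaled(theta_corrupted, tau, alpha=2.0)
```

In this toy case only the coordinates that moved during fine-tuning are changed by the correction; how alpha is calibrated without clean data is specific to the paper and is not reproduced here.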
