Rewind-to-Delete: Certified Machine Unlearning for Nonconvex Functions
This work addresses the need for practical data removal from trained models to respect privacy rights such as the 'right to be forgotten', offering incremental improvements in handling nonconvex functions.
The paper tackles the problem of efficiently removing data from machine learning models without full retraining, proposing a first-order black-box algorithm for certified unlearning on general nonconvex loss functions, with proven privacy-utility-complexity tradeoffs and superior experimental performance compared to existing methods.
Machine unlearning algorithms aim to efficiently remove data from a model without retraining it from scratch, in order to remove corrupted or outdated data or to respect a user's ``right to be forgotten.'' Certified machine unlearning is a strong theoretical guarantee, based on differential privacy, that quantifies the extent to which an algorithm erases data from the model weights. In contrast to existing works on certified unlearning for convex or strongly convex loss functions, or for nonconvex objectives with limiting assumptions, we propose the first first-order, black-box (i.e., applicable to models pretrained with vanilla gradient descent) algorithm for unlearning on general nonconvex loss functions. The algorithm unlearns by ``rewinding'' to an earlier step of the learning process and then performing gradient descent on the loss function of the retained data points. We prove $(\epsilon, \delta)$ certified unlearning and performance guarantees that establish the privacy-utility-complexity tradeoff of our algorithm, and we prove generalization guarantees for functions that satisfy the Polyak-Łojasiewicz inequality. Finally, we demonstrate the superior performance of our algorithm compared to existing methods within a new experimental framework that more accurately reflects unlearning user data in practice.
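The rewind-then-descend idea described above can be illustrated with a minimal sketch. It is not the paper's implementation: the toy nonconvex loss, the function names (`learn`, `rewind_unlearn`), the step counts, and the learning rate are all hypothetical choices made for illustration. The final Gaussian perturbation is likewise an assumption, included because differential-privacy-style guarantees are typically obtained by adding noise; the value of `sigma` here is a placeholder and is not calibrated to any $(\epsilon, \delta)$.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def grad(w, X, y):
    # Gradient of the toy nonconvex loss mean((sigmoid(X w) - y)^2).
    s = sigmoid(X @ w)
    return X.T @ (2.0 * (s - y) * s * (1.0 - s)) / len(y)

def learn(X, y, steps, lr, w0):
    # Vanilla full-batch gradient descent; every iterate is stored
    # so that a later unlearning request can rewind to it.
    w = w0.copy()
    iterates = [w.copy()]
    for _ in range(steps):
        w = w - lr * grad(w, X, y)
        iterates.append(w.copy())
    return iterates

def rewind_unlearn(iterates, X_retain, y_retain, k, lr, sigma, rng):
    # Rewind k steps from the final iterate, then run k gradient
    # descent steps on the loss of the retained data only.
    w = iterates[len(iterates) - 1 - k].copy()
    for _ in range(k):
        w = w - lr * grad(w, X_retain, y_retain)
    # Assumption: Gaussian output perturbation; sigma is an
    # uncalibrated placeholder, not a certified noise level.
    return w + sigma * rng.standard_normal(w.shape)

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 5))
y = (X[:, 0] > 0).astype(float)

forget = np.arange(10)                         # indices to unlearn
retain = np.setdiff1d(np.arange(200), forget)  # remaining indices

iterates = learn(X, y, steps=100, lr=0.5, w0=np.zeros(5))
w_unlearned = rewind_unlearn(
    iterates, X[retain], y[retain], k=20, lr=0.5, sigma=0.01, rng=rng
)
```

In this sketch the rewind depth `k` is the knob behind the tradeoff the abstract mentions: `k = 0` returns the original model (plus noise) at no cost, while `k` equal to the number of training steps amounts to retraining from scratch on the retained data.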