Descent-to-Delete: Gradient-Based Methods for Machine Unlearning
This addresses the need for efficient and secure data removal in machine learning systems, offering a novel approach to machine unlearning with practical implications for privacy and compliance.
The paper tackles the data deletion problem for convex models by introducing gradient-based algorithms that handle arbitrary adversarial update sequences with bounded per-deletion runtime and steady-state error, without growth over time.
We study the data deletion problem for convex models. By leveraging techniques from convex optimization and reservoir sampling, we give the first data deletion algorithms that are able to handle an arbitrarily long sequence of adversarial updates while promising both per-deletion run-time and steady-state error that do not grow with the length of the update sequence. We also introduce several new conceptual distinctions: for example, we can ask that after a deletion, the entire state maintained by the optimization algorithm is statistically indistinguishable from the state that would have resulted had we retrained, or we can ask for the weaker condition that only the observable output is statistically indistinguishable from the observable output that would have resulted from retraining. We are able to give more efficient deletion algorithms under this weaker deletion criterion.