A unified PAC-Bayesian framework for machine unlearning via information risk minimization
This work addresses the need for practical machine unlearning mechanisms, which are crucial for privacy and compliance in AI systems. Its contribution is incremental: it unifies and interprets prior methods rather than introducing a new approach.
The paper tackles the problem of efficiently removing the influence of specific training data from a trained model without full retraining, by developing a unified PAC-Bayesian framework that recovers two existing unlearning methods as information risk minimization problems.
Machine unlearning refers to mechanisms that can remove, upon request, the influence of a subset of training data from a trained model without incurring the cost of retraining from scratch. This paper develops a unified PAC-Bayesian framework for machine unlearning that recovers two recent design principles, variational unlearning (Nguyen et al., 2020) and the forgetting Lagrangian (Golatkar et al., 2020), as information risk minimization problems (Zhang, 2006). Accordingly, both criteria can be interpreted as PAC-Bayesian upper bounds on the test loss of the unlearned model that take the form of free energy metrics.
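For orientation, the generic shape of an information risk minimization objective can be sketched as follows. This is only the standard free energy form, not the paper's exact bounds. The notation is assumed for illustration: q is a distribution over model parameters w, p is a reference (prior) distribution, \beta > 0 is an inverse temperature, and \hat{L}_{\mathcal{D}_r} is the empirical loss on the data retained after the removal request.

```latex
% Free energy: expected empirical loss plus a KL complexity penalty
\[
  \mathcal{F}_\beta(q)
  = \mathbb{E}_{w \sim q}\big[\hat{L}_{\mathcal{D}_r}(w)\big]
  + \frac{1}{\beta}\,\mathrm{KL}(q \,\|\, p)
\]
% Its minimizer over all distributions q is the Gibbs posterior
\[
  q^\star(w) \propto p(w)\,\exp\!\big(-\beta\,\hat{L}_{\mathcal{D}_r}(w)\big)
\]
```

Under this reading, the two unlearning criteria would differ mainly in how the loss term and the reference distribution p are chosen. The specific choices and constants are those of the cited works and are not reproduced here.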