The Right to be Forgotten in Federated Learning: An Efficient Realization with Rapid Retraining
This paper addresses the right to be forgotten for data holders in federated learning, incrementally extending centralized unlearning methods to that setting.
The paper tackles the problem of machine unlearning in federated learning systems, where data holders cannot share training data, by proposing a rapid retraining approach that efficiently erases data samples while preserving model utility, as demonstrated through evaluations on four real-world datasets.
In machine learning, the emergence of \textit{the right to be forgotten} gave birth to a paradigm named \textit{machine unlearning}, which enables data holders to proactively erase their data from a trained model. Existing machine unlearning techniques focus on centralized training, where the server must have access to all holders' training data to conduct the unlearning process. How to achieve unlearning when full access to all training data is unavailable remains largely underexplored. One noteworthy example is Federated Learning (FL), where each participating data holder trains locally without sharing its training data with the central server. In this paper, we investigate the problem of machine unlearning in FL systems. We start with a formal definition of the unlearning problem in FL and propose a rapid retraining approach to fully erase data samples from a trained FL model. The resulting design allows data holders to jointly conduct the unlearning process efficiently while keeping their training data local. Our formal convergence and complexity analyses demonstrate that our design can preserve model utility with high efficiency. Extensive evaluations on four real-world datasets illustrate the effectiveness and performance of our proposed realization.
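To make the setting concrete, the following minimal sketch shows the naive baseline that a rapid retraining approach aims to speed up: a data holder deletes a sample locally, and all holders retrain the federated model from scratch. It is not the paper's algorithm. It assumes plain federated averaging on a one-parameter linear model, and every name in it (`local_update`, `fed_avg`, `unlearn_by_retraining`, the toy data) is hypothetical and chosen for illustration only.

```python
from typing import List, Tuple

Sample = Tuple[float, float]  # one (x, y) training pair held by a client


def local_update(w: float, data: List[Sample],
                 lr: float = 0.05, steps: int = 5) -> float:
    """Run a few gradient-descent steps on squared error for y ~ w * x.

    Only the updated weight leaves the client; the samples never do.
    """
    for _ in range(steps):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fed_avg(clients: List[List[Sample]], rounds: int = 50) -> float:
    """Federated averaging: the server averages the clients' local weights,
    weighted by how many samples each client holds."""
    w = 0.0
    for _ in range(rounds):
        active = [d for d in clients if d]  # skip clients with no data left
        total = sum(len(d) for d in active)
        w = sum(len(d) * local_update(w, d) for d in active) / total
    return w


def unlearn_by_retraining(clients: List[List[Sample]],
                          client_id: int, sample: Sample) -> float:
    """Naive unlearning baseline (an assumption of this sketch): the holder
    drops the sample locally and training restarts from the initial model."""
    remaining = [list(d) for d in clients]  # copy; do not mutate the input
    remaining[client_id].remove(sample)
    return fed_avg(remaining)


# Toy data: every sample follows y = 2x except one outlier held by client 1.
outlier = (1.0, 10.0)
clients = [
    [(1.0, 2.0), (2.0, 4.0)],
    [(1.0, 2.0), (3.0, 6.0), outlier],
]

w_full = fed_avg(clients)                               # trained with the outlier
w_unlearned = unlearn_by_retraining(clients, 1, outlier)  # outlier erased
```

Because retraining restarts from the same initial model and never touches the erased sample, `w_unlearned` is identical to the model that would have been trained had the sample never existed. The cost is a full training run per erasure request, which is the overhead a faster retraining scheme would need to reduce.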