CRLGOct 13, 2024

Efficient Federated Unlearning under Plausible Deniability

arXiv:2410.09947v17 citationsh-index: 6Mach learn
Originality Incremental advance
AI Analysis

It addresses privacy compliance in federated learning for users under regulations like GDPR, though it is incremental as it builds on existing unlearning concepts.

This paper tackles the problem of efficiently removing client data from federated learning models to comply with privacy regulations, by introducing a method that allows the server to plausibly deny client participation and reduce memory usage by 30 times and retraining time by 1.6-500769 times.

Privacy regulations like the GDPR in Europe and the CCPA in the US allow users the right to remove their data ML applications. Machine unlearning addresses this by modifying the ML parameters in order to forget the influence of a specific data point on its weights. Recent literature has highlighted that the contribution from data point(s) can be forged with some other data points in the dataset with probability close to one. This allows a server to falsely claim unlearning without actually modifying the model's parameters. However, in distributed paradigms such as FL, where the server lacks access to the dataset and the number of clients are limited, claiming unlearning in such cases becomes a challenge. This paper introduces an efficient way to achieve federated unlearning, by employing a privacy model which allows the FL server to plausibly deny the client's participation in the training up to a certain extent. We demonstrate that the server can generate a Proof-of-Deniability, where each aggregated update can be associated with at least x number of client updates. This enables the server to plausibly deny a client's participation. However, in the event of frequent unlearning requests, the server is required to adopt an unlearning strategy and, accordingly, update its model parameters. We also perturb the client updates in a cluster in order to avoid inference from an honest but curious server. We show that the global model satisfies differential privacy after T number of communication rounds. The proposed methodology has been evaluated on multiple datasets in different privacy settings. The experimental results show that our framework achieves comparable utility while providing a significant reduction in terms of memory (30 times), as well as retraining time (1.6-500769 times). The source code for the paper is available.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes