LG, CR, CV | Feb 1, 2022

Fishing for User Data in Large-Batch Federated Learning via Gradient Magnification

arXiv:2202.00580v2 | 120 citations
AI Analysis

This work exposes privacy risks for users of federated learning systems by showing that data extraction attacks remain feasible in practical, large-scale environments, a significant advance over earlier attacks that worked only in limited settings.

The paper tackles privacy vulnerabilities in federated learning by developing a strategy that recovers user data from gradient updates in realistic large-batch settings without architectural modifications, achieving high-fidelity data extraction in both cross-device and cross-silo scenarios.

Federated learning (FL) has rapidly risen in popularity due to its promise of privacy and efficiency. Previous works have exposed privacy vulnerabilities in the FL pipeline by recovering user data from gradient updates. However, existing attacks fail to address realistic settings because they either 1) require toy settings with very small batch sizes, or 2) require unrealistic and conspicuous architecture modifications. We introduce a new strategy that dramatically elevates existing attacks to operate on batches of arbitrarily large size, and without architectural modifications. Our model-agnostic strategy only requires modifications to the model parameters sent to the user, which is a realistic threat model in many scenarios. We demonstrate the strategy in challenging large-scale settings, obtaining high-fidelity data extraction in both cross-device and cross-silo federated learning.
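To make the batch-size limitation concrete, here is a minimal NumPy sketch of a well-known leakage property of a linear layer. It is not the paper's strategy. It assumes a hypothetical toy layer with a squared-error loss, and the names `linear_layer_grads` and `recover_input` are illustrative only. With a single sample, dividing a row of the weight gradient by the matching bias gradient returns the input exactly. With a batch, the same ratio returns only a weighted mix of the samples, which is why naive recovery is confined to very small batches.

```python
import numpy as np

def linear_layer_grads(W, b, X, Y):
    """Batch-averaged gradients of 0.5 * ||W x + b - y||^2 for a linear layer."""
    Z = X @ W.T + b              # forward pass, shape (n, out)
    dZ = (Z - Y) / len(X)        # per-sample dL/dz, scaled by 1/n
    dW = dZ.T @ X                # weight gradient, shape (out, in)
    db = dZ.sum(axis=0)          # bias gradient, shape (out,)
    return dW, db

def recover_input(dW, db):
    """Divide one weight-gradient row by its bias gradient (exact only for n = 1)."""
    i = np.argmax(np.abs(db))    # pick the row with the largest bias gradient
    return dW[i] / db[i]

rng = np.random.default_rng(0)
W, b = rng.normal(size=(3, 4)), rng.normal(size=3)

# Single-sample update: the input is recovered exactly.
x_single, y_single = rng.normal(size=(1, 4)), rng.normal(size=(1, 3))
recovered_single = recover_input(*linear_layer_grads(W, b, x_single, y_single))

# Batch of 8: the same ratio is a weighted mix of all samples.
X_batch, Y_batch = rng.normal(size=(8, 4)), rng.normal(size=(8, 3))
recovered_batch = recover_input(*linear_layer_grads(W, b, X_batch, Y_batch))

print(np.allclose(recovered_single, x_single[0]))
print(any(np.allclose(recovered_batch, x) for x in X_batch))
```

According to the abstract, the paper's contribution is to lift this kind of limitation by modifying only the model parameters sent to the user. The sketch illustrates the baseline problem, not that modification.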

Code Implementations: 1 repo