LG | AI | Oct 18, 2021

Towards General Deep Leakage in Federated Learning

arXiv:2110.09074v2 | 66 citations
Originality: Incremental advance
AI Analysis

This paper addresses privacy vulnerabilities in federated learning systems, a critical concern for applications that handle sensitive user data, though it appears incremental in that it extends existing reconstruction attacks.

The paper tackles the problem of data leakage in federated learning by developing methods to reconstruct training data from shared gradients or weights, breaking through unrealistic assumptions to apply attacks in broader scenarios. The results show that their approach exceeds the state-of-the-art GradInversion in batch size, image quality, and label distribution adaptability on benchmarks like CIFAR-10 and ImageNet.

Unlike traditional central training, federated learning (FL) improves the performance of the global model by sharing and aggregating local models rather than local data to protect the users' privacy. Although this training approach appears secure, some research has demonstrated that an attacker can still recover private data based on the shared gradient information. This on-the-fly reconstruction attack deserves to be studied in depth because it can occur at any stage of training, whether at the beginning or at the end of model training; no relevant dataset is required and no additional models need to be trained. We break through some unrealistic assumptions and limitations to apply this reconstruction attack in a broader range of scenarios. We propose methods that can reconstruct the training data from shared gradients or weights, corresponding to the FedSGD and FedAvg usage scenarios, respectively. We propose a zero-shot approach to restore labels even if there are duplicate labels in the batch. We study the relationship between the label and image restoration. We find that image restoration fails even if there is only one incorrectly inferred label in the batch; we also find that when batch images have the same label, the corresponding image is restored as a fusion of that class of images. Our approaches are evaluated on classic image benchmarks, including CIFAR-10 and ImageNet. The batch size, image quality, and the adaptability of the label distribution of our approach exceed those of GradInversion, the state-of-the-art.
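To make the label-leakage idea concrete, here is a minimal sketch of the simplest known case, not the paper's zero-shot method: a single training sample, a softmax classifier, and cross-entropy loss. Under those assumptions the gradient with respect to the final-layer bias equals softmax(logits) minus the one-hot label, so exactly one entry is negative and its index is the label. The function names (`bias_gradient`, `infer_label`) and the random logits are illustrative assumptions only; batches with duplicate labels, which the abstract targets, need more than this sign test.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of logits."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [v / total for v in exps]


def bias_gradient(logits, label):
    """Gradient of cross-entropy loss w.r.t. the final-layer bias, one sample.

    For softmax + cross-entropy this is softmax(logits) - one_hot(label).
    """
    probs = softmax(logits)
    return [p - (1.0 if i == label else 0.0) for i, p in enumerate(probs)]


def infer_label(grad_b):
    """Attacker side: the most negative entry marks the true class."""
    return min(range(len(grad_b)), key=lambda i: grad_b[i])


# Hypothetical client state: random logits for a 10-class problem.
random.seed(0)
num_classes = 10
true_label = 7
logits = [random.gauss(0.0, 1.0) for _ in range(num_classes)]

# What the client would share (here only the bias gradient).
grad_b = bias_gradient(logits, true_label)

# What the attacker recovers from the shared gradient alone.
recovered = infer_label(grad_b)
print("recovered label:", recovered)
```

The sign test works here because every softmax probability lies strictly between 0 and 1, so only the true-class entry can be negative. Once gradients are averaged over a batch, several classes can contribute to the same entry, which is why batches with repeated labels are the harder setting.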

Code Implementations: 1 repo