LG AI | Oct 18, 2021

Towards General Deep Leakage in Federated Learning

Jiahui Geng, Yongli Mou, Feifei Li, Qing Li, Oya Beyan, Stefan Decker, Chunming Rong

arXiv:2110.09074v218.966 citations

Originality: Incremental advance

AI Analysis

The paper addresses privacy vulnerabilities in federated learning systems, a critical concern for applications that handle sensitive user data. The advance appears incremental because it extends existing gradient-based reconstruction attacks.

The paper tackles the problem of data leakage in federated learning by developing methods to reconstruct training data from shared gradients or weights, breaking through unrealistic assumptions to apply attacks in broader scenarios. The results show that their approach exceeds the state-of-the-art GradInversion in batch size, image quality, and label distribution adaptability on benchmarks like CIFAR-10 and ImageNet.

Unlike traditional central training, federated learning (FL) improves the performance of the global model by sharing and aggregating local models rather than local data to protect the users' privacy. Although this training approach appears secure, some research has demonstrated that an attacker can still recover private data based on the shared gradient information. This on-the-fly reconstruction attack deserves to be studied in depth because it can occur at any stage of training, whether at the beginning or at the end of model training; no relevant dataset is required and no additional models need to be trained. We break through some unrealistic assumptions and limitations to apply this reconstruction attack in a broader range of scenarios. We propose methods that can reconstruct the training data from shared gradients or weights, corresponding to the FedSGD and FedAvg usage scenarios, respectively. We propose a zero-shot approach to restore labels even if there are duplicate labels in the batch. We study the relationship between the label and image restoration. We find that image restoration fails even if there is only one incorrectly inferred label in the batch; we also find that when batch images have the same label, the corresponding image is restored as a fusion of that class of images. Our approaches are evaluated on classic image benchmarks, including CIFAR-10 and ImageNet. The batch size, image quality, and the adaptability of the label distribution of our approach exceed those of GradInversion, the state-of-the-art.
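To make the leakage concrete, the sketch below shows the simplest known case, not the method proposed in the paper: a single sample passed through one linear layer with softmax cross-entropy loss. In that setting the bias gradient equals softmax(z) minus the one-hot label, so its only negative entry reveals the label, and each row of the weight gradient is the input scaled by the matching bias-gradient entry. All names (`client_gradients`, `leak_from_gradients`) and the toy dimensions are assumptions chosen for illustration. The batched setting with duplicate labels that the abstract describes is harder and is not covered here.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(z - z.max())
    return e / e.sum()

def client_gradients(W, b, x, y):
    """Gradients a client would share for ONE sample (hypothetical toy model).

    Model: logits = W @ x + b, loss = softmax cross-entropy with label y.
    """
    dz = softmax(W @ x + b)
    dz[y] -= 1.0                   # dL/dz = softmax(z) - onehot(y)
    return np.outer(dz, x), dz     # dL/dW, dL/db

def leak_from_gradients(grad_W, grad_b):
    """Recover label and input from the shared gradients alone."""
    label = int(np.argmin(grad_b))         # only the true class is negative
    x_rec = grad_W[label] / grad_b[label]  # row = grad_b[label] * x
    return label, x_rec

# Toy setup: 10 classes, 32 input features (illustrative values only).
rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(10, 32))
b = 0.1 * rng.normal(size=10)
x_private = rng.normal(size=32)
y_private = 3

grad_W, grad_b = client_gradients(W, b, x_private, y_private)
label, x_rec = leak_from_gradients(grad_W, grad_b)
```

With a batch size above one, the shared gradient is an average over samples, so this closed-form recovery no longer applies directly. Batch attacks such as the one described in the abstract address that harder case.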
