Revealing and Protecting Labels in Distributed Training
This work addresses privacy risks to users in federated learning by demonstrating a label leakage attack and evaluating defenses against it, though it is incremental in that it builds on prior gradient-based reconstruction techniques.
The authors tackled the problem of sensitive label leakage from gradients in distributed training. They proposed a method that discovers the training labels from only the last-layer gradient and the ID-to-label mapping, demonstrated its effectiveness in image classification and speech recognition, and showed that gradient quantization and sparsification can mitigate the attack.
Distributed learning paradigms such as federated learning often involve transmitting model updates, or gradients, over a network, thereby avoiding transmission of private data. However, such gradients can still reveal sensitive information about the training data. Prior works have demonstrated that labels can be revealed analytically from the last layer of certain models (e.g., ResNet), or that they can be reconstructed jointly with model inputs by using Gradients Matching [Zhu et al'19] with additional knowledge about the current state of the model. In this work, we propose a method to discover the set of labels of training samples from only the gradient of the last layer and the ID-to-label mapping. Our method is applicable to a wide variety of model architectures across multiple domains. We demonstrate the effectiveness of our method for model training in two domains: image classification and automatic speech recognition. Furthermore, we show that existing reconstruction techniques become more effective when used in conjunction with our method. Conversely, we demonstrate that gradient quantization and sparsification can significantly reduce the success of the attack.
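To make the leakage concrete, here is a minimal sketch of the simplest analytic case, assuming a softmax cross-entropy loss and a last layer with a bias term. It is not the method proposed in this work; it only illustrates why last-layer gradients carry label information. For one sample, the gradient with respect to the last-layer bias is p - y (softmax probabilities minus the one-hot label), so over a batch, any class with a negative entry must appear among the labels. The function names (`bias_gradient`, `labels_from_negative_entries`, `sparsify_top_k`) and the toy inputs are hypothetical, chosen for illustration only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def bias_gradient(batch_logits, labels):
    """Mean gradient of softmax cross-entropy w.r.t. the last-layer bias.

    Per sample the gradient is p - onehot(label); we average over the batch.
    """
    num_classes = len(batch_logits[0])
    grad = [0.0] * num_classes
    for logits, label in zip(batch_logits, labels):
        probs = softmax(logits)
        for c in range(num_classes):
            grad[c] += probs[c] - (1.0 if c == label else 0.0)
    return [g / len(labels) for g in grad]


def labels_from_negative_entries(grad):
    """Naive rule: a negative entry implies the class is present in the batch.

    The converse does not hold, so this can miss labels that are present.
    """
    return {c for c, g in enumerate(grad) if g < 0}


def sparsify_top_k(grad, k):
    """Toy top-k sparsification: keep the k largest-magnitude entries."""
    order = sorted(range(len(grad)), key=lambda c: abs(grad[c]), reverse=True)
    keep = set(order[:k])
    return [g if c in keep else 0.0 for c, g in enumerate(grad)]


# Toy batch: 3 samples, 4 classes, uniform predictions, labels 1, 3, 3.
toy_logits = [[0.0, 0.0, 0.0, 0.0] for _ in range(3)]
toy_labels = [1, 3, 3]
toy_grad = bias_gradient(toy_logits, toy_labels)
print(labels_from_negative_entries(toy_grad))
print(labels_from_negative_entries(sparsify_top_k(toy_grad, 2)))
```

In this toy setting the naive sign rule finds both labels in the dense gradient. Top-k sparsification can only shrink the set the rule recovers, because zeroing entries removes negative cues and never creates new ones. This is in the same spirit as the mitigation reported above, though the attack and defenses studied in the work are more involved than this sketch.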