LG · CR · ML · Jul 16, 2020

Data Poisoning Attacks Against Federated Learning Systems

arXiv:2007.08432v2 · 909 citations
AI Analysis

This paper addresses security vulnerabilities in federated learning, which matters for privacy-preserving distributed AI, but its defense is incremental because it builds on existing mitigation approaches.

The paper investigates targeted data poisoning attacks in federated learning systems, showing that even a small percentage of malicious participants can cause substantial drops in classification accuracy and recall, and proposes a defense strategy to identify such participants.

Federated learning (FL) is an emerging paradigm for distributed training of large-scale deep neural networks in which participants' data remains on their own devices with only model updates being shared with a central server. However, the distributed nature of FL gives rise to new threats caused by potentially malicious participants. In this paper, we study targeted data poisoning attacks against FL systems in which a malicious subset of the participants aim to poison the global model by sending model updates derived from mislabeled data. We first demonstrate that such data poisoning attacks can cause substantial drops in classification accuracy and recall, even with a small percentage of malicious participants. We additionally show that the attacks can be targeted, i.e., they have a large negative impact only on classes that are under attack. We also study attack longevity in early/late round training, the impact of malicious participant availability, and the relationships between the two. Finally, we propose a defense strategy that can help identify malicious participants in FL to circumvent poisoning attacks, and demonstrate its effectiveness.
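
The attack described in the abstract, in which malicious participants train on mislabeled data, can be illustrated with a minimal sketch. This is not the authors' code: the function names, the participant layout, the synthetic random labels, and the class indices are all hypothetical, chosen only to show what label flipping by a fraction of participants and per-class recall might look like.

```python
import random


def flip_labels(labels, source_class, target_class):
    """Return a copy of `labels` with every `source_class` relabeled as `target_class`."""
    return [target_class if y == source_class else y for y in labels]


def build_participants(num_participants, malicious_fraction, samples_per_participant,
                       num_classes, source_class, target_class, seed=0):
    """Create synthetic local label sets; a random subset of participants is poisoned.

    Assumption: labels are drawn uniformly at random, purely for illustration.
    """
    rng = random.Random(seed)
    num_malicious = int(round(num_participants * malicious_fraction))
    malicious_ids = set(rng.sample(range(num_participants), num_malicious))

    participants = []
    for pid in range(num_participants):
        labels = [rng.randrange(num_classes) for _ in range(samples_per_participant)]
        is_malicious = pid in malicious_ids
        if is_malicious:
            # Malicious participants train on mislabeled data.
            labels = flip_labels(labels, source_class, target_class)
        participants.append({"id": pid, "malicious": is_malicious, "labels": labels})
    return participants


def per_class_recall(true_labels, predicted_labels, cls):
    """Recall for one class: correctly predicted `cls` samples / all true `cls` samples."""
    predictions_for_cls = [p for t, p in zip(true_labels, predicted_labels) if t == cls]
    if not predictions_for_cls:
        return None  # class absent from the evaluation set
    return sum(1 for p in predictions_for_cls if p == cls) / len(predictions_for_cls)


if __name__ == "__main__":
    # Hypothetical setup: 50 participants, 10% malicious, flipping class 5 to class 3.
    participants = build_participants(50, 0.10, 100, 10, source_class=5, target_class=3)
    print(sum(p["malicious"] for p in participants), "malicious participants")
```

Measuring recall per class, rather than overall accuracy alone, is what would expose a targeted attack: under these assumptions only the flipped source class should degrade, while the other classes stay largely unaffected.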

Code Implementations: 2 repos
