FLIP: A Provable Defense Framework for Backdoor Mitigation in Federated Learning
This work addresses security vulnerabilities in federated learning, a distributed training setting, by offering a provable defense against backdoor attacks. The contribution is incremental in that it builds on existing robust aggregation methods.
The paper tackles backdoor attacks in federated learning by proposing a defense based on trigger reverse engineering. The defense comes with a guaranteed reduction in attack success rate and does not harm benign accuracy; experiments across datasets and attack settings support these claims.
Federated Learning (FL) is a distributed learning paradigm that enables different parties to jointly train a high-quality model while preserving strong privacy protection. In this scenario, individual participants may be compromised and perform backdoor attacks by poisoning the data (or gradients). Existing work on robust aggregation and certified FL robustness does not study how hardening benign clients can affect the global model (and the malicious clients). In this work, we theoretically analyze the connection among cross-entropy loss, attack success rate, and clean accuracy in this setting. Moreover, we propose a defense based on trigger reverse engineering and show that our method achieves a guaranteed robustness improvement (i.e., a reduction in attack success rate) without affecting benign accuracy. We conduct comprehensive experiments across different datasets and attack settings. Comparisons against eight competing state-of-the-art (SOTA) defense methods show the empirical superiority of our method on both single-shot and continuous FL backdoor attacks. Code is available at https://github.com/KaiyuanZh/FLIP.
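The two evaluation quantities named in the abstract, clean accuracy and attack success rate, can be illustrated with a minimal sketch. It assumes the common convention that attack success rate is the fraction of trigger-stamped inputs, excluding those whose true label is already the target, that the model assigns to the attacker's target label. The abstract does not state the paper's exact definition, and the function names `clean_accuracy` and `attack_success_rate` are hypothetical, not taken from the FLIP repository.

```python
from typing import Sequence


def clean_accuracy(preds: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of clean (trigger-free) inputs that are classified correctly."""
    if len(preds) != len(labels) or len(labels) == 0:
        raise ValueError("preds and labels must be non-empty and of equal length")
    correct = sum(p == y for p, y in zip(preds, labels))
    return correct / len(labels)


def attack_success_rate(
    triggered_preds: Sequence[int],
    true_labels: Sequence[int],
    target_label: int,
) -> float:
    """Fraction of trigger-stamped inputs predicted as the attacker's target.

    Assumption: samples whose true label already equals the target are
    excluded, since predicting the target for them is not evidence of a
    successful backdoor.
    """
    if len(triggered_preds) != len(true_labels):
        raise ValueError("triggered_preds and true_labels must have equal length")
    # Keep only samples that the backdoor would actually have to flip.
    eligible = [
        (p, y) for p, y in zip(triggered_preds, true_labels) if y != target_label
    ]
    if not eligible:
        raise ValueError("no samples with a true label different from the target")
    hits = sum(p == target_label for p, _ in eligible)
    return hits / len(eligible)


if __name__ == "__main__":
    # Toy predictions for illustration only; these are not results from the paper.
    print(clean_accuracy([0, 1, 2, 2], [0, 1, 2, 1]))
    print(attack_success_rate([7, 7, 3, 7], [0, 1, 3, 7], target_label=7))
```

Under this convention, a successful defense lowers the second quantity while leaving the first essentially unchanged, which is the trade-off the abstract describes.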