Invariant Aggregator for Defending against Federated Backdoor Attacks
This work addresses a critical security vulnerability in federated learning for applications such as healthcare and finance, though it is an incremental improvement over existing defenses.
The paper tackles the problem of defending against backdoor attacks in federated learning, where malicious clients can embed triggers without degrading overall model utility. It proposes an invariant aggregator that mitigates these attacks with negligible impact on utility, as demonstrated empirically across three datasets.
Federated learning enables training high-utility models across several clients without directly sharing their private data. As a downside, the federated setting makes the model vulnerable to various adversarial attacks in the presence of malicious clients. Despite the theoretical and empirical success in defending against attacks that aim to degrade models' utility, defending against backdoor attacks, which increase model accuracy exclusively on backdoor samples without hurting the utility on other samples, remains challenging. To this end, we first analyze the failure modes of existing defenses over a flat loss landscape, which is common for well-designed neural networks such as ResNet (He et al., 2015) but is often overlooked by previous works. Then, we propose an invariant aggregator that redirects the aggregated update to invariant directions that are generally useful by selectively masking out the update elements that favor a few, possibly malicious, clients. Theoretical results suggest that our approach provably mitigates backdoor attacks and remains effective over flat loss landscapes. Empirical results on three datasets with different modalities and varying numbers of clients further demonstrate that our approach mitigates a broad class of backdoor attacks at a negligible cost to model utility.
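
The masking idea can be illustrated with a minimal sketch. The abstract does not specify how the mask is computed, so the sketch assumes one plausible criterion: a coordinate of the update is kept only when a large fraction of clients agree on its sign, and is zeroed otherwise. The function name `masked_mean_aggregate`, the `agreement` threshold, and the toy updates are all hypothetical and are not taken from the paper.

```python
from typing import List


def masked_mean_aggregate(updates: List[List[float]], agreement: float = 0.8) -> List[float]:
    """Average client updates, zeroing coordinates that lack sign agreement.

    Assumption (not from the paper): a coordinate counts as "invariant" when at
    least `agreement` of the clients push it in the same direction.
    """
    if not updates:
        raise ValueError("need at least one client update")
    n_clients = len(updates)
    dim = len(updates[0])
    aggregated = []
    for j in range(dim):
        column = [u[j] for u in updates]
        positives = sum(1 for v in column if v > 0)
        negatives = sum(1 for v in column if v < 0)
        # Fraction of clients that share the majority sign on this coordinate.
        consistency = max(positives, negatives) / n_clients
        if consistency >= agreement:
            # Broadly supported direction: keep the plain average.
            aggregated.append(sum(column) / n_clients)
        else:
            # Direction favored by only a few clients: mask it out.
            aggregated.append(0.0)
    return aggregated


# Hypothetical toy round: four benign clients and one client that pushes
# hard on a coordinate nobody else uses (a stand-in for a backdoor direction).
updates = [
    [0.5, 0.0],
    [0.4, 0.0],
    [0.6, 0.0],
    [0.5, 0.0],
    [0.5, 3.0],
]
result = masked_mean_aggregate(updates, agreement=0.8)
```

In this toy round the first coordinate is supported by every client and survives as an ordinary average, while the second coordinate, pushed by a single client, is masked to zero. A plain mean would instead move the second coordinate by 0.6, which is the kind of update a flat loss landscape lets an attacker slip through without hurting utility.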