SDBA: A Stealthy and Long-Lasting Durable Backdoor Attack in Federated Learning
This work addresses a critical security problem for NLP-based federated learning systems and highlights an urgent need for robust defenses, although the contribution is incremental because it builds on prior backdoor attack research.
The paper addresses the vulnerability of federated learning to backdoor attacks in NLP tasks by introducing SDBA, a stealthy and long-lasting attack mechanism. SDBA outperforms existing backdoor attacks in durability and bypasses representative defense mechanisms, with experiments on LSTM, GPT-2, and T5 models.
Federated learning is a promising approach for training machine learning models while preserving data privacy. However, its distributed nature makes it vulnerable to backdoor attacks, particularly in NLP tasks, where related research remains limited. This paper introduces SDBA, a novel backdoor attack mechanism designed for NLP tasks in federated learning environments. Through a systematic analysis across LSTM and GPT-2 models, we identify the most vulnerable layers for backdoor injection and achieve both stealth and long-lasting durability by applying layer-wise gradient masking and top-k% gradient masking. To evaluate the task generalizability of SDBA, we additionally conduct experiments on the T5 model. Experiments on next-token prediction, sentiment analysis, and question answering tasks show that SDBA outperforms existing backdoor attacks in terms of durability and effectively bypasses representative defense mechanisms, demonstrating notable performance in transformer-based models such as GPT-2. These results highlight the urgent need for robust defense strategies in NLP-based federated learning systems.
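The two masking steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes gradients are stored as a plain dictionary from layer name to a flat list of floats, and that "top-k%" means keeping the k% of entries with the largest magnitude inside each targeted layer; the abstract does not state the selection criterion, so that choice is an assumption. The function names and the layer names (`embedding`, `lstm.weight_hh`, `decoder`) are hypothetical.

```python
import math


def layerwise_mask(grads, target_layers):
    """Zero the gradient of every layer not named in target_layers."""
    return {
        name: list(values) if name in target_layers else [0.0] * len(values)
        for name, values in grads.items()
    }


def top_k_percent_mask(values, k_percent):
    """Keep the k_percent largest-magnitude entries; zero the rest.

    Assumption: selection is by absolute value. The source does not
    specify the criterion.
    """
    if not 0 < k_percent <= 100:
        raise ValueError("k_percent must be in (0, 100]")
    keep = max(1, math.ceil(len(values) * k_percent / 100))
    # Indices ordered by descending magnitude (stable for ties).
    order = sorted(range(len(values)), key=lambda i: abs(values[i]), reverse=True)
    kept = set(order[:keep])
    return [v if i in kept else 0.0 for i, v in enumerate(values)]


def mask_gradients(grads, target_layers, k_percent):
    """Apply the layer-wise mask, then the top-k% mask inside target layers."""
    masked = layerwise_mask(grads, target_layers)
    return {
        name: top_k_percent_mask(values, k_percent) if name in target_layers else values
        for name, values in masked.items()
    }


# Hypothetical gradients for a three-layer model.
example_grads = {
    "embedding": [0.5, -0.2],
    "lstm.weight_hh": [0.1, -0.9, 0.3, 0.05],
    "decoder": [0.7],
}
example_masked = mask_gradients(example_grads, {"lstm.weight_hh"}, 50)
```

In this sketch only the targeted layer keeps a nonzero update, and within that layer only half of the coordinates survive. A restricted update of this kind touches fewer parameters, which is one plausible reading of how masking could support the stealth and durability the abstract reports.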