Byzantine-Resilient Federated Learning at Edge
This work addresses an important problem for edge computing systems, where irregular data and malicious devices can degrade learning performance. However, the contribution appears incremental, since it builds on existing methods for Byzantine resilience and gradient compression.
The paper tackles the problem of federated learning at the edge with heavy-tailed data, proposing algorithms that achieve Byzantine resilience, communication efficiency, and optimal statistical error rates, with experiments verifying their efficacy on synthetic and real-world datasets.
Both Byzantine resilience and communication efficiency have attracted tremendous attention recently for their significance in edge federated learning. However, most existing algorithms may fail when dealing with real-world irregular data that behaves in a heavy-tailed manner. To address this issue, we study stochastic convex and non-convex optimization problems for federated learning at the edge and show how to handle heavy-tailed data while simultaneously retaining Byzantine resilience, communication efficiency, and optimal statistical error rates. Specifically, we first present a Byzantine-resilient distributed gradient descent algorithm that handles heavy-tailed data and still converges under standard assumptions. To reduce the communication overhead, we further propose a second algorithm that incorporates gradient compression techniques to save communication costs during the learning process. Theoretical analysis shows that our algorithms achieve an order-optimal statistical error rate in the presence of Byzantine devices. Finally, we conduct extensive experiments on both synthetic and real-world datasets to verify the efficacy of our algorithms.
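To make the setting in the abstract concrete, the following is a minimal sketch of Byzantine-resilient distributed gradient descent on heavy-tailed data with optional gradient compression. It is not the paper's algorithm, which the abstract does not specify. It assumes a coordinate-wise trimmed mean as the robust aggregator and top-k sparsification as the compressor, both common stand-ins. All names (`trimmed_mean`, `robust_aggregate`, `top_k_sparsify`, `local_gradient`, `run`), the mean-estimation task, the symmetric Pareto noise, and the parameter values are hypothetical choices made for illustration.

```python
import random
from typing import List, Tuple

Vector = List[float]


def trimmed_mean(values: List[float], trim: int) -> float:
    """Mean after dropping the `trim` smallest and `trim` largest values."""
    if 2 * trim >= len(values):
        raise ValueError("trim must leave at least one value")
    kept = sorted(values)[trim:len(values) - trim]
    return sum(kept) / len(kept)


def robust_aggregate(gradients: List[Vector], trim: int) -> Vector:
    """Coordinate-wise trimmed mean of the gradients reported by all devices."""
    return [trimmed_mean(list(coord), trim) for coord in zip(*gradients)]


def top_k_sparsify(gradient: Vector, k: int) -> Vector:
    """Keep the k largest-magnitude entries and zero the rest."""
    order = sorted(range(len(gradient)), key=lambda i: abs(gradient[i]), reverse=True)
    keep = set(order[:k])
    return [g if i in keep else 0.0 for i, g in enumerate(gradient)]


def local_gradient(w: Vector, samples: List[Vector]) -> Vector:
    """Gradient of the local loss: the average of 0.5 * ||w - x||^2 over samples."""
    n = len(samples)
    return [sum(w[j] - x[j] for x in samples) / n for j in range(len(w))]


def run(num_devices: int = 10, num_byzantine: int = 2, k: int = 4,
        steps: int = 200, lr: float = 0.1, seed: int = 0) -> Tuple[Vector, Vector]:
    """Estimate a mean vector by distributed gradient descent (hypothetical setup)."""
    rng = random.Random(seed)
    true_mean = [1.0, -2.0, 0.5, 3.0]

    def draw() -> Vector:
        # Assumed noise model: symmetric Pareto, zero mean, finite variance, heavy tails.
        return [m + rng.choice([-1.0, 1.0]) * (rng.paretovariate(3.0) - 1.0)
                for m in true_mean]

    data = [[draw() for _ in range(200)] for _ in range(num_devices)]
    w = [0.0] * len(true_mean)
    for _ in range(steps):
        reports = []
        for device in range(num_devices):
            if device < num_byzantine:
                # A Byzantine device may send anything; here, a large constant.
                reports.append([100.0] * len(w))
            else:
                reports.append(top_k_sparsify(local_gradient(w, data[device]), k))
        # Trimming as many values per side as there are Byzantine devices
        # removes their reports from every coordinate in this setup.
        g = robust_aggregate(reports, trim=num_byzantine)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w, true_mean


if __name__ == "__main__":
    estimate, target = run(k=4)  # k equal to the dimension disables compression
    print("estimate:", [round(v, 3) for v in estimate])
    print("target:  ", target)
```

Calling `run(k=2)` instead makes each honest device send only its two largest-magnitude gradient entries. Plain top-k sparsification without error feedback can slow or bias convergence, so this sketch says nothing about the guarantees the paper claims for its own compression scheme.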