LG AIMay 2, 2024

Gradient-Congruity Guided Federated Sparse Training

Chris Xing Tian, Yibing Liu, Haoliang Li, Ray C. C. Cheung, Shiqi Wang

arXiv:2405.01189v14.63 citationsh-index: 26

Originality Incremental advance

AI Analysis

This addresses efficiency and generalization issues in federated learning for edge computing, but it is incremental as it builds on existing sparse training and gradient-based methods.

The paper tackles the challenges of high computational and communication costs and poor generalization in federated learning by proposing FedSGC, which integrates dynamic sparse training and gradient congruity inspection, achieving competitive accuracy with state-of-the-art methods while reducing overheads.

Edge computing allows artificial intelligence and machine learning models to be deployed on edge devices, where they can learn from local data and collaborate to form a global model. Federated learning (FL) is a distributed machine learning technique that facilitates this process while preserving data privacy. However, FL also faces challenges such as high computational and communication costs regarding resource-constrained devices, and poor generalization performance due to the heterogeneity of data across edge clients and the presence of out-of-distribution data. In this paper, we propose the Gradient-Congruity Guided Federated Sparse Training (FedSGC), a novel method that integrates dynamic sparse training and gradient congruity inspection into federated learning framework to address these issues. Our method leverages the idea that the neurons, in which the associated gradients with conflicting directions with respect to the global model contain irrelevant or less generalized information for other clients, and could be pruned during the sparse training process. Conversely, the neurons where the associated gradients with consistent directions could be grown in a higher priority. In this way, FedSGC can greatly reduce the local computation and communication overheads while, at the same time, enhancing the generalization abilities of FL. We evaluate our method on challenging non-i.i.d settings and show that it achieves competitive accuracy with state-of-the-art FL methods across various scenarios while minimizing computation and communication costs.

View on arXiv PDF

Similar