LG · SY · Jan 5

Distributed Federated Learning by Alternating Periods of Training

arXiv:2601.01793v1 · h-index: 7
Originality: Incremental advance
AI Analysis

This paper addresses scalability and fault-tolerance problems in federated learning systems with many clients, though the contribution appears incremental.

The paper tackles the scalability and fault-tolerance issues in federated learning by proposing a distributed approach with multiple servers and inter-server communication, showing that servers converge to a common model within a small tolerance of the ideal model.

Federated learning is a privacy-focused approach towards machine learning where models are trained on client devices with locally available data and aggregated at a central server. However, the dependence on a single central server is challenging in the case of a large number of clients and even poses the risk of a single point of failure. To address these critical limitations of scalability and fault-tolerance, we present a distributed approach to federated learning comprising multiple servers with inter-server communication capabilities. While providing a fully decentralized approach, the designed framework retains the core federated learning structure where each server is associated with a disjoint set of clients with server-client communication capabilities. We propose a novel DFL (Distributed Federated Learning) algorithm which uses alternating periods of local training on the client data followed by global training among servers. We show that the DFL algorithm, under a suitable choice of parameters, ensures that all the servers converge to a common model value within a small tolerance of the ideal model, thus exhibiting effective integration of local and global training models. Finally, we illustrate our theoretical claims through numerical simulations.
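The alternating structure described in the abstract can be illustrated with a minimal sketch. This is not the paper's DFL algorithm, whose update rules and parameter conditions are not given here. It is a toy under stated assumptions: each client holds a quadratic loss, servers average their own clients' models after a local period, and the global period is plain consensus averaging over a ring of servers. The function name `run_dfl_sketch`, the ring mixing matrix, and all default parameter values are hypothetical choices made for illustration.

```python
import numpy as np


def run_dfl_sketch(num_servers=4, clients_per_server=5, dim=3,
                   rounds=200, local_steps=5, consensus_steps=3,
                   lr=0.1, seed=0):
    """Toy alternation of local training and inter-server averaging.

    Assumption: client i minimises 0.5 * ||w - t_i||^2, a stand-in for
    training on local data. Not the paper's actual update rule.
    """
    rng = np.random.default_rng(seed)
    # One target vector per client; servers own disjoint client sets.
    targets = rng.normal(size=(num_servers, clients_per_server, dim))
    # Minimiser of the average loss over all clients (the "ideal" model).
    ideal = targets.mean(axis=(0, 1))

    # Assumed ring topology: doubly stochastic mixing matrix in which each
    # server averages with its two neighbours.
    W = np.zeros((num_servers, num_servers))
    for s in range(num_servers):
        W[s, s] = 0.5
        W[s, (s - 1) % num_servers] += 0.25
        W[s, (s + 1) % num_servers] += 0.25

    models = np.zeros((num_servers, dim))
    for _ in range(rounds):
        # Local period: each server's clients start from the server model,
        # take gradient steps on their own loss, then the server averages.
        for s in range(num_servers):
            client_models = np.tile(models[s], (clients_per_server, 1))
            for _ in range(local_steps):
                client_models -= lr * (client_models - targets[s])
            models[s] = client_models.mean(axis=0)
        # Global period: servers exchange models with neighbours only.
        for _ in range(consensus_steps):
            models = W @ models
    return models, ideal


if __name__ == "__main__":
    models, ideal = run_dfl_sketch()
    gap = np.max(np.linalg.norm(models - ideal, axis=1))
    print("largest server distance to ideal model:", gap)
```

In this toy setting the servers do not reach exactly the same model. Each local period pulls a server toward the optimum of its own clients, and the consensus steps only partly undo that drift, so the servers settle close to the ideal model rather than on it. Raising `consensus_steps` or lowering `local_steps` shrinks the remaining gap, which is one way to read the abstract's "suitable choice of parameters".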

