LG · DC · IT · SP · Jul 23, 2021

Device Scheduling and Update Aggregation Policies for Asynchronous Federated Learning

arXiv:2107.11415v1 · 35 citations
Originality: Incremental advance
AI Analysis

This work addresses training efficiency in federated learning over distributed systems with heterogeneous devices. The advance is incremental because it builds on existing asynchronous FL methods.

The paper tackles the straggler problem in federated learning by proposing an asynchronous framework with periodic aggregation. Its simulations show that effective device scheduling and update aggregation policies differ from those of the synchronous case, and that age-aware weighting of updates improves learning performance.

Federated Learning (FL) is a newly emerged decentralized machine learning (ML) framework that combines on-device local training with server-based model synchronization to train a centralized ML model over distributed nodes. In this paper, we propose an asynchronous FL framework with periodic aggregation to eliminate the straggler issue in FL systems. For the proposed model, we investigate several device scheduling and update aggregation policies and compare their performances when the devices have heterogeneous computation capabilities and training data distributions. From the simulation results, we conclude that the scheduling and aggregation design for asynchronous FL can be rather different from the synchronous case. For example, a norm-based significance-aware scheduling policy might not be efficient in an asynchronous FL setting, and an appropriate "age-aware" weighting design for the model aggregation can greatly improve the learning performance of such systems.
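
To make the "age-aware" weighting idea concrete, here is a minimal sketch of one periodic aggregation step. It is an illustration under stated assumptions, not the paper's method: the abstract does not give the weighting function, so the geometric decay `decay ** age`, the names `age_weights` and `aggregate`, and the representation of models as flat lists of floats are all hypothetical choices. Here "age" is taken to mean the number of aggregation rounds that have passed since the device downloaded the global model it trained on.

```python
from typing import List, Sequence, Tuple


def age_weights(ages: Sequence[int], decay: float = 0.5) -> List[float]:
    """Return normalized weights that shrink geometrically with update age.

    The geometric form decay ** age is an assumed example; any
    non-increasing function of age could be substituted.
    """
    raw = [decay ** age for age in ages]
    total = sum(raw)
    return [r / total for r in raw]


def aggregate(
    global_model: Sequence[float],
    updates: Sequence[Tuple[Sequence[float], int]],
    decay: float = 0.5,
) -> List[float]:
    """Apply one periodic aggregation step.

    Each entry of `updates` is (delta, age): the model difference a device
    reports and how many rounds old its base model is. Devices that have not
    finished training are simply absent from `updates`, so the server never
    waits for stragglers.
    """
    if not updates:
        # No device reported in this period: keep the current global model.
        return list(global_model)
    weights = age_weights([age for _, age in updates], decay)
    new_model = list(global_model)
    for (delta, _), w in zip(updates, weights):
        for k, d in enumerate(delta):
            new_model[k] += w * d
    return new_model


if __name__ == "__main__":
    # A fresh update (age 0) and a stale one (age 1) arrive in the same period.
    fresh = ([3.0, 0.0], 0)
    stale = ([0.0, 3.0], 1)
    print(aggregate([0.0, 0.0], [fresh, stale]))
```

With `decay = 0.5`, the fresh update receives twice the weight of the one-round-old update. Setting `decay = 1.0` recovers plain averaging over the reporting devices, which gives a simple baseline for comparing age-aware and age-agnostic aggregation.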
