LGDCAug 23, 2021

Anarchic Federated Learning

arXiv:2108.09875v462 citations
Originality Highly original
AI Analysis

This addresses the problem of efficient federated learning in edge networks with heterogeneous workers, offering a novel paradigm that is incremental in algorithm design.

The paper tackles the challenge of flexible worker participation in federated learning due to heterogeneity in data and computing capabilities by introducing Anarchic Federated Learning (AFL), where workers can choose when to participate and how many local steps to perform. The proposed Anarchic Federated Averaging algorithms achieve the best known convergence rate and retain linear speedup, as validated on real-world datasets.

Present-day federated learning (FL) systems deployed over edge networks consists of a large number of workers with high degrees of heterogeneity in data and/or computing capabilities, which call for flexible worker participation in terms of timing, effort, data heterogeneity, etc. To satisfy the need for flexible worker participation, we consider a new FL paradigm called "Anarchic Federated Learning" (AFL) in this paper. In stark contrast to conventional FL models, each worker in AFL has the freedom to choose i) when to participate in FL, and ii) the number of local steps to perform in each round based on its current situation (e.g., battery level, communication channels, privacy concerns). However, such chaotic worker behaviors in AFL impose many new open questions in algorithm design. In particular, it remains unclear whether one could develop convergent AFL training algorithms, and if yes, under what conditions and how fast the achievable convergence speed is. Toward this end, we propose two Anarchic Federated Averaging (AFA) algorithms with two-sided learning rates for both cross-device and cross-silo settings, which are named AFA-CD and AFA-CS, respectively. Somewhat surprisingly, we show that, under mild anarchic assumptions, both AFL algorithms achieve the best known convergence rate as the state-of-the-art algorithms for conventional FL. Moreover, they retain the highly desirable {\em linear speedup effect} with respect of both the number of workers and local steps in the new AFL paradigm. We validate the proposed algorithms with extensive experiments on real-world datasets.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes