LG DC | Oct 30, 2024

Federated Learning under Periodic Client Participation and Heterogeneous Data: A New Communication-Efficient Algorithm and Analysis

arXiv:2410.23131v3 | 13.4 | 12 citations | h-index: 8 | Has Code | NIPS

Originality: Incremental advance

AI Analysis

This work addresses communication efficiency and scalability in federated learning for applications with intermittent client availability, though it is an incremental advance over existing methods.

The paper tackles federated learning under realistic periodic client participation and heterogeneous data by proposing Amplified SCAFFOLD, which achieves linear speedup and reduces communication rounds to O(ε^{-2}) for cyclic participation, compared to prior O(κ^2 ε^{-4}) rounds.

In federated learning, it is common to assume that clients are always available to participate in training, which may not be feasible with user devices in practice. Recent works analyze federated learning under more realistic participation patterns, such as cyclic client availability or arbitrary participation. However, all such works either require strong assumptions (e.g., all clients participate almost surely within a bounded window), do not achieve linear speedup and reduced communication rounds, or are not applicable in the general non-convex setting. In this work, we focus on nonconvex optimization and consider participation patterns in which the chance of participation over a fixed window of rounds is equal among all clients, which includes cyclic client availability as a special case. Under this setting, we propose a new algorithm, named Amplified SCAFFOLD, and prove that it achieves linear speedup, reduced communication, and resilience to data heterogeneity simultaneously. In particular, for cyclic participation, our algorithm is proved to enjoy $\mathcal{O}(\epsilon^{-2})$ communication rounds to find an $\epsilon$-stationary point in the non-convex stochastic setting. In contrast, the prior work under the same setting requires $\mathcal{O}(\kappa^2 \epsilon^{-4})$ communication rounds, where $\kappa$ denotes the data heterogeneity. Therefore, our algorithm significantly reduces communication rounds due to better dependency in terms of $\epsilon$ and $\kappa$. Our analysis relies on a fine-grained treatment of the nested dependence between client participation and errors in the control variates, which results in tighter guarantees than previous work. We also provide experimental results with (1) synthetic data and (2) real-world data with a large number of clients $(N = 250)$, demonstrating the effectiveness of our algorithm under periodic client participation.
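To make the cyclic participation pattern concrete, here is a minimal Python sketch of one way such a schedule could be simulated. It is not the paper's implementation and does not implement Amplified SCAFFOLD. The function name `cyclic_schedule` and every parameter value except N = 250 (five groups, four rounds per group, ten sampled clients per round) are hypothetical choices for illustration, and equal-sized groups with uniform sampling are assumed.

```python
import random


def cyclic_schedule(num_clients, num_groups, rounds_per_group,
                    sample_size, num_rounds, seed=0):
    """Return, for each round, the list of participating client ids.

    Hypothetical illustration of cyclic availability: clients are split
    into equally sized groups, and only one group is available at a time.
    """
    if num_clients % num_groups != 0:
        raise ValueError("this sketch assumes equally sized groups")
    rng = random.Random(seed)
    group_size = num_clients // num_groups
    groups = [list(range(g * group_size, (g + 1) * group_size))
              for g in range(num_groups)]

    schedule = []
    for r in range(num_rounds):
        # The available group advances every `rounds_per_group` rounds
        # and wraps around after the last group.
        active = groups[(r // rounds_per_group) % num_groups]
        # Sample participants uniformly without replacement.
        schedule.append(rng.sample(active, sample_size))
    return schedule


# One full cycle: 5 groups x 4 rounds each = 20 rounds (assumed values).
schedule = cyclic_schedule(num_clients=250, num_groups=5,
                           rounds_per_group=4, sample_size=10,
                           num_rounds=20)
```

Under these assumptions, each client is expected to participate `rounds_per_group * sample_size / group_size` times per full cycle, which is the same for every client. This is one instance of the abstract's condition that the chance of participation over a fixed window of rounds is equal among all clients.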
