FedVARP: Tackling the Variance Due to Partial Client Participation in Federated Learning
This work addresses a key bottleneck in data-heterogeneous federated learning, the error introduced when only a subset of clients participates in each round, and offers a practical server-side remedy for distributed edge computing.
The paper tackles convergence error caused by partial client participation in federated learning. It proposes FedVARP, which eliminates this error and outperforms state-of-the-art methods, and ClusterFedVARP, which achieves comparable performance with lower server memory requirements.
Data-heterogeneous federated learning (FL) systems suffer from two significant sources of convergence error: 1) client drift error caused by performing multiple local optimization steps at clients, and 2) partial client participation error caused by the fact that only a small subset of the edge clients participate in every training round. We find that among these, only the former has received significant attention in the literature. To remedy this, we propose FedVARP, a novel variance reduction algorithm applied at the server that eliminates error due to partial client participation. To do so, the server simply maintains in memory the most recent update for each client and uses these as surrogate updates for the non-participating clients in every round. Further, to alleviate the memory requirement at the server, we propose a novel clustering-based variance reduction algorithm, ClusterFedVARP. Unlike previously proposed methods, neither FedVARP nor ClusterFedVARP requires additional computation at clients or communication of additional optimization parameters. Through extensive experiments, we show that FedVARP outperforms state-of-the-art methods, and that ClusterFedVARP achieves performance comparable to FedVARP with much lower memory requirements.
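The server-side idea described above, keeping the most recent update from every client and using it as a surrogate when that client is absent, can be illustrated with a minimal sketch. The abstract does not give the exact aggregation rule, so the SAGA-style combination below (mean of stored updates plus an averaged correction from the participants) is an assumption, as are the class name `SurrogateUpdateServer`, the zero initialisation of the memory, the `server_lr` parameter, and the convention that an update is added to the model.

```python
from typing import Dict, List

Vector = List[float]


class SurrogateUpdateServer:
    """Hypothetical server that caches each client's most recent update."""

    def __init__(self, num_clients: int, dim: int, server_lr: float = 1.0) -> None:
        self.num_clients = num_clients
        self.server_lr = server_lr
        self.model: Vector = [0.0] * dim
        # One stored update per client; starting at zero is an assumption.
        self.memory: List[Vector] = [[0.0] * dim for _ in range(num_clients)]

    def aggregate(self, fresh: Dict[int, Vector]) -> Vector:
        """Combine fresh updates from participants with stored surrogates.

        `fresh` maps a participating client's index to its new update.
        Assumed rule: mean of all stored updates, plus the average over
        participants of (fresh update - stored update).
        """
        if not fresh:
            raise ValueError("at least one client must participate")
        dim = len(self.model)
        # Stored updates stand in for every client, present or not.
        mean_memory = [
            sum(self.memory[i][k] for i in range(self.num_clients)) / self.num_clients
            for k in range(dim)
        ]
        # Participants correct their own stale entries.
        correction = [
            sum(fresh[i][k] - self.memory[i][k] for i in fresh) / len(fresh)
            for k in range(dim)
        ]
        direction = [m + c for m, c in zip(mean_memory, correction)]
        # Apply the aggregated update to the global model.
        self.model = [w + self.server_lr * d for w, d in zip(self.model, direction)]
        # Refresh the memory only for clients that participated this round.
        for i, update in fresh.items():
            self.memory[i] = list(update)
        return direction


server = SurrogateUpdateServer(num_clients=4, dim=1)
server.aggregate({0: [1.0], 1: [3.0]})  # round 1: clients 0 and 1 participate
server.aggregate({2: [4.0]})            # round 2: only client 2 participates
```

In this sketch the clients send nothing beyond their usual update and do no extra work; all additional state lives in `memory` on the server, which grows with the number of clients. That storage cost is what the abstract says the clustering-based variant is meant to reduce.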