LG, DC, ML · Sep 18, 2020

Federated Learning with Nesterov Accelerated Gradient

arXiv:2009.08716v2 · 45 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of slow convergence and low accuracy in federated learning for distributed data scenarios, representing an incremental improvement over existing momentum-based methods.

The paper tackles the inefficiency of conventional federated learning by proposing FedNAG, which integrates Nesterov Accelerated Gradient into both workers and aggregators, resulting in a 3-24% increase in learning accuracy and an 11-70% reduction in total training time compared to benchmarks.

Federated learning (FL) is a fast-developing technique that allows multiple workers to train a global model based on a distributed dataset. Conventional FL (FedAvg) employs the gradient descent algorithm, which may not be efficient enough. Momentum can improve the situation by adding a momentum step that accelerates convergence, and it has demonstrated its benefits in both centralized and FL environments. It is well known that Nesterov Accelerated Gradient (NAG) is a more advantageous form of momentum, but so far it has not been clear how to quantify the benefits of NAG in FL. This motivates us to propose FedNAG, which employs NAG in each worker as well as NAG momentum and model aggregation in the aggregator. We provide a detailed convergence analysis of FedNAG and compare it with FedAvg. Extensive experiments based on real-world datasets and trace-driven simulation are conducted, demonstrating that FedNAG increases the learning accuracy by 3-24% and decreases the total training time by 11-70% compared with the benchmarks under a wide range of settings.
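The abstract describes FedNAG only at a high level: NAG in each worker, with both the model and the momentum aggregated. The sketch below is one way that idea could look in code, not the authors' implementation. Several details are assumptions made for illustration: the particular NAG update form in `nag_step`, weighting the aggregation by local data size, the hyperparameter values, and the toy quadratic objectives. All function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple

Vector = List[float]
GradFn = Callable[[Vector], Vector]


def nag_step(w: Vector, v: Vector, grad_fn: GradFn,
             lr: float, gamma: float) -> Tuple[Vector, Vector]:
    """One NAG update (assumed form): refresh the momentum, then look ahead."""
    g = grad_fn(w)
    v_new = [gamma * vi - lr * gi for vi, gi in zip(v, g)]
    w_new = [wi + gamma * vni - lr * gi for wi, vni, gi in zip(w, v_new, g)]
    return w_new, v_new


def weighted_average(vectors: Sequence[Vector], weights: Sequence[float]) -> Vector:
    """Average vectors coordinate-wise, weighted (assumption) by data size."""
    total = sum(weights)
    dim = len(vectors[0])
    return [sum(wt * vec[k] for vec, wt in zip(vectors, weights)) / total
            for k in range(dim)]


def fednag(grad_fns: Sequence[GradFn], data_sizes: Sequence[float], w0: Vector,
           lr: float = 0.05, gamma: float = 0.9,
           local_steps: int = 4, rounds: int = 50) -> Vector:
    """Run local NAG on each worker, then aggregate model AND momentum."""
    w_global = list(w0)
    v_global = [0.0] * len(w0)
    for _ in range(rounds):
        local_ws, local_vs = [], []
        for grad_fn in grad_fns:
            # Each worker starts the round from the aggregated model and momentum.
            w, v = list(w_global), list(v_global)
            for _ in range(local_steps):
                w, v = nag_step(w, v, grad_fn, lr, gamma)
            local_ws.append(w)
            local_vs.append(v)
        # Aggregator step: average the local models and the local momenta.
        w_global = weighted_average(local_ws, data_sizes)
        v_global = weighted_average(local_vs, data_sizes)
    return w_global


def make_quadratic_grad(center: Vector) -> GradFn:
    """Gradient of the toy local loss 0.5 * ||w - center||^2 (illustrative only)."""
    def grad(w: Vector) -> Vector:
        return [wi - ci for wi, ci in zip(w, center)]
    return grad


# Two hypothetical workers whose local optima differ (a stand-in for non-IID data).
centers = [[1.0, 0.0], [3.0, 2.0]]
sizes = [1.0, 1.0]
w_final = fednag([make_quadratic_grad(c) for c in centers], sizes, w0=[0.0, 0.0])
print(w_final)
```

With equal data sizes, the global optimum of this toy problem is the mean of the two centers, so `w_final` should approach `[2.0, 1.0]`. Averaging the momentum along with the model is the part that distinguishes this sketch from plain FedAvg with local momentum, where each worker's momentum would be discarded or kept local between rounds.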
