LG
Oct 1, 2021

Layer-wise and Dimension-wise Locally Adaptive Federated Learning

arXiv:2110.00532v3
5 citations
Originality: Incremental advance
AI Analysis

This work aims to improve training efficiency and model performance in federated learning, for example when training across mobile devices. The advance is incremental: it extends existing locally adaptive methods by adding layer-wise adaptivity.

The paper addresses the training of federated deep neural networks. It proposes an FL framework that adds layer-wise adaptivity to local model updates, building on dimension-wise adaptive methods such as Adam. The results show that Fed-LAMB and Mime-LAMB converge faster and generalize better than recent adaptive FL methods, with a theoretical convergence rate that matches state-of-the-art results and a linear speedup in the number of workers.

In the emerging paradigm of Federated Learning (FL), a large number of clients, such as mobile devices, are used to train possibly high-dimensional models on their respective data. Combining (dimension-wise) adaptive gradient methods (e.g. Adam, AMSGrad) with FL has been an active direction, and this combination has been shown to outperform traditional SGD-based FL in many cases. In this paper, we focus on the problem of training federated deep neural networks, and propose a novel FL framework which further introduces layer-wise adaptivity to the local model updates. Our framework can be applied to locally adaptive FL methods, including two recent algorithms, Mime and Fed-AMS. Theoretically, we provide a convergence analysis of our layer-wise FL methods, coined Fed-LAMB and Mime-LAMB, which matches the convergence rate of state-of-the-art results in FL and exhibits linear speedup in terms of the number of workers. Experimental results on various datasets and models, under both IID and non-IID local data settings, show that both Fed-LAMB and Mime-LAMB achieve faster convergence and better generalization performance than various recent adaptive FL methods.
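
To make the idea of combining dimension-wise and layer-wise adaptivity concrete, here is a minimal NumPy sketch of one local client step followed by server averaging. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the update rule, so the sketch uses a generic AMSGrad-style per-coordinate scaling and a LAMB-style per-layer trust ratio, omits bias correction and weight decay, and averages only model weights on the server. The function names `layerwise_adaptive_step` and `average_models` and all hyperparameter defaults are hypothetical.

```python
import numpy as np

def layerwise_adaptive_step(params, grads, m, v, v_hat,
                            lr=0.01, beta1=0.9, beta2=0.999, eps=1e-8):
    """One local step on a client (hypothetical helper, not from the paper).

    params, grads, m, v, v_hat are lists with one array per layer.
    The moment buffers m, v, v_hat are updated in place.
    """
    new_params = []
    for i, (w, g) in enumerate(zip(params, grads)):
        # Dimension-wise adaptivity (AMSGrad-style): per-coordinate moments.
        m[i] = beta1 * m[i] + (1 - beta1) * g
        v[i] = beta2 * v[i] + (1 - beta2) * g * g
        v_hat[i] = np.maximum(v_hat[i], v[i])
        r = m[i] / (np.sqrt(v_hat[i]) + eps)

        # Layer-wise adaptivity (LAMB-style): rescale the whole layer's
        # update by the ratio of weight norm to update norm.
        w_norm = np.linalg.norm(w)
        r_norm = np.linalg.norm(r)
        trust = w_norm / r_norm if w_norm > 0 and r_norm > 0 else 1.0

        new_params.append(w - lr * trust * r)
    return new_params

def average_models(client_params):
    """Server step (assumed plain averaging): mean of each layer across clients."""
    return [np.mean(layer_copies, axis=0) for layer_copies in zip(*client_params)]
```

In this sketch the trust ratio makes each layer's step length equal to `lr` times that layer's weight norm, so layers with very different scales move by comparable relative amounts. That is the intuition usually given for layer-wise methods; how Fed-LAMB and Mime-LAMB handle optimizer state across rounds is not specified in the text above.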

