Locally Adaptive Federated Learning
This work addresses convergence inefficiencies in federated learning for distributed machine learning systems, offering an incremental improvement over existing adaptive methods.
The paper tackles slow convergence in federated learning caused by using the same stepsize on every client, and proposes locally adaptive algorithms that use client-specific geometric information. The results show that these algorithms match tuned FedAvg in convex settings, outperform FedAvg and FedAMS in non-convex experiments, and achieve superior generalization performance.
Federated learning is a paradigm of distributed machine learning in which multiple clients coordinate with a central server to learn a model without sharing their own training data. Standard federated optimization methods such as Federated Averaging (FedAvg) ensure balance among the clients by using the same stepsize for local updates on all clients. However, this means that all clients need to respect the global geometry of the function, which could yield slow convergence. In this work, we propose locally adaptive federated learning algorithms that leverage the local geometric information of each client function. We show that such locally adaptive methods with uncoordinated stepsizes across all clients can be particularly efficient in interpolated (overparameterized) settings, and analyze their convergence in the presence of heterogeneous data for convex and strongly convex settings. We validate our theoretical claims by performing illustrative experiments for both i.i.d. and non-i.i.d. cases. Our proposed algorithms match the optimization performance of tuned FedAvg in the convex setting, outperform FedAvg as well as state-of-the-art adaptive federated algorithms like FedAMS for non-convex experiments, and come with superior generalization performance.
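To make the idea of uncoordinated, client-specific stepsizes concrete, the following is a minimal sketch and not the algorithm proposed in this work. The abstract does not state the stepsize rule, so a capped Polyak-type stepsize is assumed here as a stand-in. The toy problem is also an assumption: each client holds a diagonal quadratic with its own curvature, and all clients share the same minimizer, which mimics the interpolated setting. The names `make_client`, `local_polyak_steps`, and `run_federated`, and the parameters `c` and `gamma_max`, are hypothetical.

```python
import random


def make_client(curv, x_star):
    """Build a client loss f_i(x) = 0.5 * sum_j curv[j] * (x[j] - x_star[j])**2.

    Returns a function that maps x to (loss, gradient). The shared x_star
    models interpolation: every client attains loss 0 at the same point.
    """
    def loss_and_grad(x):
        diff = [xj - sj for xj, sj in zip(x, x_star)]
        loss = 0.5 * sum(a * d * d for a, d in zip(curv, diff))
        grad = [a * d for a, d in zip(curv, diff)]
        return loss, grad
    return loss_and_grad


def local_polyak_steps(x, loss_and_grad, num_steps, c=1.0, gamma_max=1.0):
    """Run local gradient steps with a capped Polyak-type stepsize.

    Assumed rule: gamma = min(f_i(x) / (c * ||grad||^2), gamma_max), which
    presumes the optimal local loss is 0. Each client computes gamma from
    its own loss and gradient, so stepsizes differ across clients.
    """
    x = list(x)
    for _ in range(num_steps):
        loss, grad = loss_and_grad(x)
        g2 = sum(g * g for g in grad)
        if g2 == 0.0:  # already at a stationary point
            break
        gamma = min(loss / (c * g2), gamma_max)
        x = [xj - gamma * g for xj, g in zip(x, grad)]
    return x


def run_federated(num_clients=4, dim=5, rounds=50, local_steps=5, seed=0):
    """Alternate local adaptive steps with server-side averaging."""
    rng = random.Random(seed)
    x_star = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    clients = []
    for i in range(num_clients):
        # Client i has curvature scaled by (1 + i): heterogeneous local geometry.
        curv = [(1 + i) * rng.uniform(0.5, 2.0) for _ in range(dim)]
        clients.append(make_client(curv, x_star))

    x = [0.0] * dim
    for _ in range(rounds):
        local_models = [local_polyak_steps(x, f, local_steps) for f in clients]
        # Server step: plain average of the client models, as in FedAvg.
        x = [sum(m[j] for m in local_models) / num_clients for j in range(dim)]
    return x, x_star


if __name__ == "__main__":
    x, x_star = run_federated()
    dist = sum((a - b) ** 2 for a, b in zip(x, x_star)) ** 0.5
    print("distance to shared minimizer:", dist)
```

In this sketch no stepsize is tuned or shared: a client with large curvature automatically takes smaller steps than a client with small curvature, whereas a single global stepsize would have to be small enough for the steepest client.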