On Second-order Optimization Methods for Federated Learning
This work addresses optimization challenges in federated learning, where the training data is distributed across clients. The contribution is incremental: it builds on existing methods and adds one new variant.
The paper evaluates second-order optimization methods in federated learning, finds that Federated Averaging performs surprisingly well against them under fair metrics, and proposes a novel variant that combines second-order local information with a global line search.
We consider federated learning (FL), where the training data is distributed across a large number of clients. The standard optimization method in this setting is Federated Averaging (FedAvg), which performs multiple local first-order optimization steps between communication rounds. In this work, we evaluate several second-order distributed methods with local steps in the FL setting; these methods promise favorable convergence properties. We (i) show that FedAvg performs surprisingly well against its second-order competitors when evaluated under fair metrics (an equal amount of local computation), in contrast to the results of previous work. Based on our numerical study, we (ii) propose a novel variant that uses second-order local information for updates and a global line search to counteract the resulting local specificity.
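To make the two kinds of update concrete, the following is a minimal sketch on a toy ridge-regression problem. It is not the paper's algorithm or experimental setup: the synthetic data, the helper names (`make_clients`, `fedavg_round`, `newton_linesearch_round`), the regularization constant `LAM`, and the simple backtracking rule are all assumptions chosen for illustration. The sketch also makes no attempt to equalize local computation between the two methods, which is the fairness criterion the abstract emphasizes.

```python
import numpy as np

LAM = 0.1  # hypothetical ridge regularization strength


def make_clients(num_clients=5, n=40, d=3, seed=0):
    """Synthetic clients sharing one true model but with different feature scales."""
    rng = np.random.default_rng(seed)
    w_true = rng.normal(size=d)
    clients = []
    for _ in range(num_clients):
        X = rng.normal(size=(n, d)) * rng.uniform(0.5, 2.0, size=d)
        y = X @ w_true + 0.1 * rng.normal(size=n)
        clients.append((X, y))
    return clients


def local_loss(w, X, y):
    r = X @ w - y
    return 0.5 * np.mean(r ** 2) + 0.5 * LAM * (w @ w)


def local_grad(w, X, y):
    return X.T @ (X @ w - y) / len(y) + LAM * w


def local_hessian(X, y):
    return X.T @ X / len(y) + LAM * np.eye(X.shape[1])


def global_loss(w, clients):
    return float(np.mean([local_loss(w, X, y) for X, y in clients]))


def fedavg_round(w, clients, local_steps=5, lr=0.05):
    """One FedAvg round: local gradient steps on each client, then averaging."""
    local_models = []
    for X, y in clients:
        w_k = w.copy()
        for _ in range(local_steps):
            w_k -= lr * local_grad(w_k, X, y)
        local_models.append(w_k)
    return np.mean(local_models, axis=0)


def newton_linesearch_round(w, clients, shrink=0.5, max_backtracks=30):
    """One round with local Newton directions and a global backtracking search.

    Each client solves its own Newton system; the server averages the
    directions and accepts the largest tried step size that lowers the
    global loss. Evaluating the global loss costs extra communication.
    """
    directions = [
        -np.linalg.solve(local_hessian(X, y), local_grad(w, X, y))
        for X, y in clients
    ]
    d = np.mean(directions, axis=0)
    f0 = global_loss(w, clients)
    step = 1.0
    for _ in range(max_backtracks):
        candidate = w + step * d
        if global_loss(candidate, clients) < f0:
            return candidate
        step *= shrink
    return w  # no improving step found; keep the current model


if __name__ == "__main__":
    clients = make_clients()
    w_fedavg = np.zeros(3)
    w_newton = np.zeros(3)
    for _ in range(10):
        w_fedavg = fedavg_round(w_fedavg, clients)
        w_newton = newton_linesearch_round(w_newton, clients)
    print("FedAvg global loss:", global_loss(w_fedavg, clients))
    print("Newton + line search global loss:", global_loss(w_newton, clients))
```

In this sketch the line search is what ties the locally computed directions back to the global objective: a direction that suits individual clients is scaled down, or rejected, if it does not reduce the loss averaged over all clients.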