From Local SGD to Local Fixed-Point Methods for Federated Learning
This work addresses communication efficiency in federated learning, which is crucial for mobile devices, but appears incremental as it builds on existing local SGD methods.
The paper tackles the problem of finding a fixed point of an average of operators in a distributed setting, motivated by federated learning, and proposes two strategies to limit communication bottlenecks, with convergence analysis and experiments showing benefits.
Most algorithms for solving optimization problems or finding saddle points of convex-concave functions are fixed-point algorithms. In this work we consider the generic problem of finding a fixed point of an average of operators, or an approximation thereof, in a distributed setting. Our work is motivated by the needs of federated learning. In this context, each local operator models the computations done locally on a mobile device. We investigate two strategies to achieve such a consensus: one based on a fixed number of local steps, and the other based on randomized computations. In both cases, the goal is to limit communication of the locally-computed variables, which is often the bottleneck in distributed frameworks. We perform convergence analysis of both methods and conduct a number of experiments highlighting the benefits of our approach.