Optimal Algorithms for Distributed Optimization
This work addresses efficient optimization in distributed systems, such as machine learning or sensor networks, by providing theoretical guarantees for convergence, though it appears incremental as it builds on existing methods like Nesterov's acceleration.
The paper tackles the problem of distributed convex optimization under network communication constraints, establishing optimal convergence rates for various function classes and showing that distributed accelerated gradient descent achieves rates comparable to centralized methods up to constant or logarithmic factors.
In this paper, we study the optimal convergence rate for distributed convex optimization problems in networks. We model the communication restrictions imposed by the network as a set of affine constraints and provide optimal complexity bounds for four different setups, namely: the function $F(\xb) \triangleq \sum_{i=1}^{m}f_i(\xb)$ is strongly convex and smooth, either strongly convex or smooth or just convex. Our results show that Nesterov's accelerated gradient descent on the dual problem can be executed in a distributed manner and obtains the same optimal rates as in the centralized version of the problem (up to constant or logarithmic factors) with an additional cost related to the spectral gap of the interaction matrix. Finally, we discuss some extensions to the proposed setup such as proximal friendly functions, time-varying graphs, improvement of the condition numbers.