Optimality in Decentralized Optimization under Bandwidth Constraints
This work addresses a communication bottleneck in distributed machine learning under limited bandwidth, deriving time complexities that are optimal up to logarithmic factors.
The paper studies decentralized optimization under bandwidth constraints. It derives time complexities for non-convex stochastic parallel and asynchronous optimization that are optimal up to logarithmic factors. The bounds are expressed in terms of min-cut/max-flow quantities, which yields tighter and more practical complexities than previous work.
We consider a realistic decentralized setup with bandwidth-constrained communication and derive optimal time complexities for non-convex stochastic parallel and asynchronous optimization (up to logarithmic factors). We develop the corresponding methods, Grace SGD and Leon SGD, for both homogeneous and heterogeneous settings. Unlike previous work, our optimal bounds are characterized in terms of min-cut/max-flow quantities and rely on tools from Gomory-Hu trees and Steiner Tree Packing problems, providing tighter and more practical complexities.
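To give a concrete sense of the min-cut/max-flow quantities mentioned above, here is a minimal Python sketch that computes the maximum flow (equivalently, the minimum cut) between pairs of workers in a small bandwidth-constrained network. It is not Grace SGD or Leon SGD, and it does not build a Gomory-Hu tree: it runs a plain Edmonds-Karp max-flow for each pair, which is enough for a toy graph. The topology, the bandwidth values, and the names `links`, `max_flow`, and `worst_pair_min_cut` are hypothetical, chosen only for illustration.

```python
from collections import deque
from itertools import combinations


def build_residual(edges):
    """Build a residual graph from undirected links (u, v, bandwidth)."""
    residual = {}
    for u, v, bw in edges:
        residual.setdefault(u, {})
        residual.setdefault(v, {})
        # An undirected link is modeled as two arcs with the same capacity.
        residual[u][v] = residual[u].get(v, 0) + bw
        residual[v][u] = residual[v].get(u, 0) + bw
    return residual


def max_flow(edges, source, sink):
    """Edmonds-Karp: repeatedly augment along a shortest path found by BFS."""
    residual = build_residual(edges)
    total = 0
    while True:
        # BFS from the source over arcs with remaining capacity.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, cap in residual[u].items():
                if cap > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left: flow equals min cut

        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            u = parent[v]
            bottleneck = min(bottleneck, residual[u][v])
            v = u

        # Push the bottleneck amount and update residual capacities.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def worst_pair_min_cut(edges):
    """Smallest min cut over all pairs of nodes (brute force over pairs)."""
    nodes = sorted({n for u, v, _ in edges for n in (u, v)})
    return min(max_flow(edges, s, t) for s, t in combinations(nodes, 2))


# Hypothetical 4-worker network: two fast links joined by slow ones.
links = [(0, 1, 10), (1, 2, 1), (2, 3, 10), (0, 2, 2)]

print(max_flow(links, 0, 3))      # 3: limited by the slow links 1-2 and 0-2
print(max_flow(links, 0, 1))      # 11: direct link plus the detour via 2
print(worst_pair_min_cut(links))  # 3
```

In this toy graph the fast 10-unit links do not help traffic between the two halves of the network: every pair that straddles the slow links is capped at 3 units. A Gomory-Hu tree represents all pairwise min cuts of an undirected graph with only n - 1 max-flow computations, whereas the brute-force loop above needs one per pair.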