Tight Complexity Bounds for Optimizing Composite Objectives
This work gives matching upper and lower bounds on the oracle complexity of minimizing an average of convex functions, a problem structure that is central to large-scale optimization in machine learning.
The paper establishes tight complexity bounds for minimizing composite convex objectives and shows a significant gap between deterministic and randomized optimization. For smooth functions, accelerated gradient descent and an accelerated SVRG variant are proven optimal in the deterministic and randomized settings, respectively. For non-smooth functions, prox-based methods achieve the optimal rates.
We provide tight upper and lower bounds on the complexity of minimizing the average of $m$ convex functions using gradient and prox oracles of the component functions. We show a significant gap between the complexity of deterministic and randomized optimization. For smooth functions, we show that accelerated gradient descent (AGD) and an accelerated variant of SVRG are optimal in the deterministic and randomized settings, respectively, and that a gradient oracle is sufficient for the optimal rate. For non-smooth functions, access to prox oracles reduces the complexity, and we present optimal methods based on smoothing that improve over methods using only gradient accesses.
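To make the oracle model concrete, the following is a minimal sketch of the two kinds of component access mentioned above, for the objective $F(x) = \frac{1}{m} \sum_{i=1}^{m} f_i(x)$. It assumes, purely for illustration, least-squares components $f_i(x) = \frac{1}{2}(\langle a_i, x \rangle - b_i)^2$, for which both oracles have closed forms. The function names and the parameter `beta` are hypothetical and not taken from the paper. The prox oracle is taken to return $\arg\min_u \{ f_i(u) + \frac{\beta}{2} \|x - u\|^2 \}$.

```python
# Illustrative sketch only. The least-squares components are an assumption,
# chosen because both oracles can be written in closed form.

def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(ui * vi for ui, vi in zip(u, v))

def component_value(a, b, x):
    """f_i(x) = 0.5 * (<a, x> - b)^2 (hypothetical example component)."""
    r = dot(a, x) - b
    return 0.5 * r * r

def gradient_oracle(a, b, x):
    """Return grad f_i(x) = (<a, x> - b) * a."""
    r = dot(a, x) - b
    return [r * ai for ai in a]

def prox_oracle(a, b, x, beta):
    """Return argmin_u f_i(u) + (beta / 2) * ||x - u||^2.

    For this component the minimizer is x - t * a with
    t = (<a, x> - b) / (beta + ||a||^2).
    """
    t = (dot(a, x) - b) / (beta + dot(a, a))
    return [xi - t * ai for xi, ai in zip(x, a)]

def average_objective(components, x):
    """F(x) = (1/m) * sum_i f_i(x)."""
    m = len(components)
    return sum(component_value(a, b, x) for a, b in components) / m

def full_gradient(components, x):
    """Gradient of F, using one gradient-oracle call per component."""
    m = len(components)
    total = [0.0] * len(x)
    for a, b in components:
        g = gradient_oracle(a, b, x)
        total = [ti + gi / m for ti, gi in zip(total, g)]
    return total

if __name__ == "__main__":
    # Two components in two dimensions, each given as a pair (a_i, b_i).
    components = [([1.0, 0.0], 1.0), ([0.0, 2.0], 2.0)]
    x0 = [0.0, 0.0]
    print(average_objective(components, x0))
    print(full_gradient(components, x0))
    print(prox_oracle([1.0, 0.0], 1.0, x0, beta=1.0))
```

In this sketch, one evaluation of the full gradient of $F$ uses $m$ component gradient calls, whereas a randomized method in the style of SVRG would typically query a single randomly chosen component per step. This is one way to see why the cost of a method is naturally counted in component oracle accesses.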