Multi-Level Composite Stochastic Optimization via Nested Variance Reduction
This work addresses optimization challenges in machine learning and operations research by improving efficiency for complex composite functions, representing an incremental advance over prior methods.
The paper tackles multi-level composite stochastic optimization problems by proposing a normalized proximal approximate gradient method with nested variance reduction, achieving a total sample complexity of O(ε^{-3}) for expectation cases and O(N + √N ε^{-2}) for finite-sum cases, with polynomial dependence on composition levels instead of exponential.
We consider multi-level composite optimization problems where each mapping in the composition is the expectation over a family of random smooth mappings or the sum of some finite number of smooth mappings. We present a normalized proximal approximate gradient (NPAG) method where the approximate gradients are obtained via nested stochastic variance reduction. In order to find an approximate stationary point where the expected norm of its gradient mapping is less than $ε$, the total sample complexity of our method is $O(ε^{-3})$ in the expectation case, and $O(N+\sqrt{N}ε^{-2})$ in the finite-sum case where $N$ is the total number of functions across all composition levels. In addition, the dependence of our total sample complexity on the number of composition levels is polynomial, rather than exponential as in previous work.