OCLGMLFeb 20, 2018

Stochastic Variance-Reduced Cubic Regularization for Nonconvex Optimization

arXiv:1802.07372v254 citations
AI Analysis

This work addresses computational efficiency for large-scale nonconvex optimization problems, such as in machine learning, by reducing sample complexity, though it is incremental as it builds on existing CR methods.

The paper tackles the high sample complexity issue of cubic regularization (CR) for nonconvex finite-sum optimization by proposing a stochastic variance-reduced CR (SVRC) method, achieving a total Hessian sample complexity of O(N^{2/3} ε^{-3/2}) and outperforming state-of-art results by a factor of O(N^{2/15}).

Cubic regularization (CR) is an optimization method with emerging popularity due to its capability to escape saddle points and converge to second-order stationary solutions for nonconvex optimization. However, CR encounters a high sample complexity issue for finite-sum problems with a large data size. %Various inexact variants of CR have been proposed to improve the sample complexity. In this paper, we propose a stochastic variance-reduced cubic-regularization (SVRC) method under random sampling, and study its convergence guarantee as well as sample complexity. We show that the iteration complexity of SVRC for achieving a second-order stationary solution within $ε$ accuracy is $O(ε^{-3/2})$, which matches the state-of-art result on CR types of methods. Moreover, our proposed variance reduction scheme significantly reduces the per-iteration sample complexity. The resulting total Hessian sample complexity of our SVRC is ${\Oc}(N^{2/3} ε^{-3/2})$, which outperforms the state-of-art result by a factor of $O(N^{2/15})$. We also study our SVRC under random sampling without replacement scheme, which yields a lower per-iteration sample complexity, and hence justifies its practical applicability.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes