A Sub-sampled Tensor Method for Non-convex Optimization
This work addresses optimization challenges in machine learning and data science by providing a scalable stochastic method for non-convex problems, though it is incremental as it matches existing deterministic rates.
The paper tackles the problem of finding local minima for smooth non-convex objective functions with a finite-sum structure by proposing a stochastic optimization method using sub-sampled derivatives and a fourth-order regularized model. It achieves a convergence rate of O(max(ε1^{-4/3}, ε2^{-2}, ε3^{-4})) iterations to find an (ε1, ε2, ε3)-third-order critical point, matching deterministic approaches.
We present a stochastic optimization method that uses a fourth-order regularized model to find local minima of smooth and potentially non-convex objective functions with a finite-sum structure. This algorithm uses sub-sampled derivatives instead of exact quantities. The proposed approach is shown to find an $(ε_1,ε_2,ε_3)$-third-order critical point in at most $\bigO\left(\max\left(ε_1^{-4/3}, ε_2^{-2}, ε_3^{-4}\right)\right)$ iterations, thereby matching the rate of deterministic approaches. In order to prove this result, we derive a novel tensor concentration inequality for sums of tensors of any order that makes explicit use of the finite-sum structure of the objective function.