On the Oracle Complexity of Higher-Order Smooth Non-Convex Finite-Sum Optimization
This work establishes theoretical limits on optimization algorithms for non-convex finite-sum problems, which is of interest to researchers in machine learning and optimization. Its contribution is incremental in that it refines existing lower bounds.
The paper establishes lower bounds for higher-order methods in smooth non-convex finite-sum optimization. It shows that deterministic algorithms cannot benefit from the finite-sum structure, and that simulating a pth-order regularized method on the whole function is optimal up to constant factors. It also gives lower bounds for randomized algorithms and compares them with the best known upper bounds. To narrow the remaining gaps, it introduces a new second-order smoothness assumption that allows a sharper lower bound while still ensuring state-of-the-art convergence guarantees.
We prove lower bounds for higher-order methods in smooth non-convex finite-sum optimization. Our contribution is threefold. First, we show that a deterministic algorithm cannot profit from the finite-sum structure of the objective, and that simulating a pth-order regularized method on the whole function by constructing exact gradient information is optimal up to constant factors. Second, we prove lower bounds for randomized algorithms and compare them with the best known upper bounds. Third, to address some gaps between the bounds, we propose a new second-order smoothness assumption that can be seen as an analogue of the first-order mean-squared smoothness assumption. We prove that it is sufficient to ensure state-of-the-art convergence guarantees, while allowing for a sharper lower bound.
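
For concreteness, the setting described in the abstract can be written out as below. This is a sketch in standard notation, not the paper's own statement: the symbols F, f_i, n, d, epsilon and L are assumptions introduced here for illustration. The stationarity criterion is the one commonly used in this literature, and only the familiar first-order mean-squared smoothness condition is shown. The paper's new second-order assumption is not reproduced here.

```latex
% Finite-sum objective: an average of n smooth, possibly non-convex components.
F(x) = \frac{1}{n} \sum_{i=1}^{n} f_i(x), \qquad x \in \mathbb{R}^d

% Usual goal (assumed here): find an \epsilon-approximate stationary point.
\|\nabla F(x)\| \le \epsilon

% First-order mean-squared smoothness, with i drawn uniformly from \{1, \dots, n\}:
\mathbb{E}_i \, \|\nabla f_i(x) - \nabla f_i(y)\|^2 \le L^2 \, \|x - y\|^2
\quad \text{for all } x, y \in \mathbb{R}^d
```

In this notation, a deterministic method that simulates a method on the whole function queries all n components at each iterate. It therefore pays n oracle calls per step, which is the cost the first result says cannot be improved by more than a constant factor.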