OCLGMar 21

SPABA: A Single-Loop and Probabilistic Stochastic Bilevel Algorithm Achieving Optimal Sample Complexity

arXiv:2405.1877769.68 citationsh-index: 6
Predicted impact top 3% in OC · last 90 daysOriginality Incremental advance
AI Analysis

This resolves a theoretical gap in machine learning optimization, providing efficient algorithms for large-scale nested problems, though it is incremental as it adapts an existing method to a specific setting.

The paper tackles the open question of whether optimal complexity bounds for stochastic bilevel optimization match those of single-level optimization, showing that SPABA achieves optimal sample complexity in both finite-sum and expectation settings, with numerical experiments confirming superior performance.

While stochastic bilevel optimization methods have been extensively studied for addressing large-scale nested optimization problems in machine learning, it remains an open question whether the optimal complexity bounds for solving bilevel optimization are the same as those in single-level optimization. Our main result resolves this question: SPABA, an adaptation of the PAGE method for nonconvex optimization in (Li et al., 2021) to the bilevel setting, can achieve optimal sample complexity in both the finite-sum and expectation settings. We show the optimality of SPABA by proving that there is no gap in complexity analysis between stochastic bilevel and single-level optimization when implementing PAGE. Notably, as indicated by the results of (Dagréou et al., 2022), there might exist a gap in complexity analysis when implementing other stochastic gradient estimators, like SGD and SAGA. In addition to SPABA, we propose several other single-loop stochastic bilevel algorithms, that either match or improve the state-of-the-art sample complexity results, leveraging our convergence rate and complexity analysis. Numerical experiments demonstrate the superior practical performance of the proposed methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes