OCLGMLDec 6, 2023

Achieving ${O}(ε^{-1.5})$ Complexity in Hessian/Jacobian-free Stochastic Bilevel Optimization

arXiv:2312.03807v33 citationsh-index: 5
Originality Highly original
AI Analysis

This addresses a bottleneck in optimization for machine learning by improving efficiency in bilevel problems, though it is incremental as it builds on prior work in the field.

The paper tackles the problem of achieving efficient sample complexity in Hessian/Jacobian-free stochastic bilevel optimization, proposing a method called FdeHBO that requires O(ε^{-1.5}) iterations to find an ε-accurate stationary point without second-order derivative computations.

In this paper, we revisit the bilevel optimization problem, in which the upper-level objective function is generally nonconvex and the lower-level objective function is strongly convex. Although this type of problem has been studied extensively, it still remains an open question how to achieve an ${O}(ε^{-1.5})$ sample complexity in Hessian/Jacobian-free stochastic bilevel optimization without any second-order derivative computation. To fill this gap, we propose a novel Hessian/Jacobian-free bilevel optimizer named FdeHBO, which features a simple fully single-loop structure, a projection-aided finite-difference Hessian/Jacobian-vector approximation, and momentum-based updates. Theoretically, we show that FdeHBO requires ${O}(ε^{-1.5})$ iterations (each using ${O}(1)$ samples and only first-order gradient information) to find an $ε$-accurate stationary point. As far as we know, this is the first Hessian/Jacobian-free method with an ${O}(ε^{-1.5})$ sample complexity for nonconvex-strongly-convex stochastic bilevel optimization.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes