OCAILGJun 25, 2024

Double Momentum Method for Lower-Level Constrained Bilevel Optimization

arXiv:2406.17386v11 citations
Originality Highly original
AI Analysis

This addresses a bottleneck in bilevel optimization for machine learning applications by providing a more efficient and theoretically grounded method for constrained problems.

The paper tackles the problem of lower-level constrained bilevel optimization by proposing a new hypergradient method that avoids restrictive differentiability assumptions and a single-loop algorithm with double momentum, achieving convergence to a stationary point in O(d₂²ε⁻⁴) iterations.

Bilevel optimization (BO) has recently gained prominence in many machine learning applications due to its ability to capture the nested structure inherent in these problems. Recently, many hypergradient methods have been proposed as effective solutions for solving large-scale problems. However, current hypergradient methods for the lower-level constrained bilevel optimization (LCBO) problems need very restrictive assumptions, namely, where optimality conditions satisfy the differentiability and invertibility conditions and lack a solid analysis of the convergence rate. What's worse, existing methods require either double-loop updates, which are sometimes less efficient. To solve this problem, in this paper, we propose a new hypergradient of LCBO leveraging the theory of nonsmooth implicit function theorem instead of using the restrive assumptions. In addition, we propose a \textit{single-loop single-timescale} algorithm based on the double-momentum method and adaptive step size method and prove it can return a $(δ, ε)$-stationary point with $\tilde{\mathcal{O}}(d_2^2ε^{-4})$ iterations. Experiments on two applications demonstrate the effectiveness of our proposed method.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes