LG · OC · Feb 3, 2025

qNBO: quasi-Newton Meets Bilevel Optimization

arXiv:2502.01076v1 · 2 citations · h-index: 6 · ICLR
Originality: Incremental advance
AI Analysis

This work addresses computational bottlenecks in bilevel optimization for machine learning practitioners, offering an incremental improvement over existing methods.

The paper targets two computational bottlenecks in bilevel optimization, computing the exact lower-level solution and the inverse Hessian, and handles them together in a single quasi-Newton framework. The authors report comparable or superior performance on tasks such as hyperparameter optimization and meta-learning.

Bilevel optimization, addressing challenges in hierarchical learning tasks, has gained significant interest in machine learning. The practical application of gradient descent to bilevel optimization encounters computational hurdles, notably the computation of the exact lower-level solution and the inverse Hessian of the lower-level objective. Although these two aspects are inherently connected, existing methods typically handle them separately by solving the lower-level problem and a linear system for the inverse Hessian-vector product. In this paper, we introduce a general framework to address these computational challenges in a coordinated manner. Specifically, we leverage quasi-Newton algorithms to accelerate the resolution of the lower-level problem while efficiently approximating the inverse Hessian-vector product. Furthermore, by exploiting the superlinear convergence properties of BFGS, we establish a non-asymptotic convergence analysis of the BFGS adaptation within our framework. Numerical experiments demonstrate the comparable or superior performance of the proposed algorithms in real-world learning tasks, including hyperparameter optimization, data hyper-cleaning, and few-shot meta-learning.
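
To make the link between the lower-level solve and the inverse Hessian-vector product concrete, here is a minimal sketch on a toy problem. It is not the paper's qNBO algorithm. Everything in it is an assumption chosen for illustration: the quadratic lower-level objective g(x, y) = 0.5 yᵀAy − xᵀy, the upper-level objective f(x, y) = 0.5‖y − t‖², the exact line search, and the function names `bfgs_lower_level` and `hypergradient`. It uses the standard implicit-differentiation hypergradient ∇F(x) = ∇ₓf − ∇²ₓᵧg [∇²ᵧᵧg]⁻¹ ∇ᵧf, with the BFGS inverse-Hessian estimate built during the lower-level solve standing in for [∇²ᵧᵧg]⁻¹.

```python
import numpy as np


def bfgs_lower_level(A, x, y0):
    """Minimize the toy objective g(x, y) = 0.5*y^T A y - x^T y over y with BFGS.

    Returns the approximate minimizer and the BFGS inverse-Hessian estimate H.
    (Toy setup assumed for illustration; A must be symmetric positive definite.)
    """
    n = y0.size
    y = y0.astype(float).copy()
    H = np.eye(n)                      # initial inverse-Hessian estimate
    grad = A @ y - x                   # gradient of g with respect to y
    for _ in range(n):
        if np.linalg.norm(grad) < 1e-12:
            break
        d = -H @ grad                  # quasi-Newton search direction
        # Exact line search; this closed form is only valid for quadratics.
        alpha = -(grad @ d) / (d @ A @ d)
        s = alpha * d
        y_new = y + s
        grad_new = A @ y_new - x
        q = grad_new - grad            # change in gradient
        rho = 1.0 / (q @ s)
        V = np.eye(n) - rho * np.outer(s, q)
        H = V @ H @ V.T + rho * np.outer(s, s)   # BFGS inverse update
        y, grad = y_new, grad_new
    return y, H


def hypergradient(A, x, t):
    """Approximate dF/dx for F(x) = f(x, y*(x)), f(x, y) = 0.5*||y - t||^2."""
    y, H = bfgs_lower_level(A, x, np.zeros_like(x, dtype=float))
    grad_y_f = y - t                   # upper-level gradient with respect to y
    cross = -np.eye(x.size)            # Jacobian in x of grad_y g = A y - x
    # grad_x f is zero here, so only the implicit term remains.
    return -cross @ (H @ grad_y_f)


A = np.array([[4.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])
x = np.array([1.0, 2.0, 3.0])
t = np.zeros(3)

approx = hypergradient(A, x, t)

# Closed-form reference for this toy problem: y*(x) = A^{-1} x.
A_inv = np.linalg.inv(A)
exact = A_inv @ (A_inv @ x - t)
```

On a strictly convex quadratic with exact line search, BFGS recovers the inverse Hessian after n steps, so `approx` should agree with `exact` up to rounding. For general lower-level problems the estimate is only approximate, which is where a convergence analysis like the one the abstract describes becomes relevant.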

