LGNAOCSTNov 3, 2025

Stochastic Regret Guarantees for Online Zeroth- and First-Order Bilevel Optimization

arXiv:2511.01126v14 citationsh-index: 47
Originality Highly original
AI Analysis

This addresses the need for more robust online bilevel optimization methods in machine learning applications like parametric loss tuning and adversarial attacks.

The paper tackles the problem of online bilevel optimization where current deterministic methods may fail with rapidly changing functions, and introduces a novel search direction that enables both first- and zeroth-order stochastic algorithms to achieve sublinear stochastic bilevel regret without window smoothing.

Online bilevel optimization (OBO) is a powerful framework for machine learning problems where both outer and inner objectives evolve over time, requiring dynamic updates. Current OBO approaches rely on deterministic \textit{window-smoothed} regret minimization, which may not accurately reflect system performance when functions change rapidly. In this work, we introduce a novel search direction and show that both first- and zeroth-order (ZO) stochastic OBO algorithms leveraging this direction achieve sublinear {stochastic bilevel regret without window smoothing}. Beyond these guarantees, our framework enhances efficiency by: (i) reducing oracle dependence in hypergradient estimation, (ii) updating inner and outer variables alongside the linear system solution, and (iii) employing ZO-based estimation of Hessians, Jacobians, and gradients. Experiments on online parametric loss tuning and black-box adversarial attacks validate our approach.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes