ML | LG | OC | Feb 17, 2023

A Lower Bound and a Near-Optimal Algorithm for Bilevel Empirical Risk Minimization

arXiv:2302.08766v5 | 19 citations | h-index: 20
Originality: Highly original
AI Analysis

This work addresses bilevel empirical risk minimization, a class of nested optimization problems with growing use in machine learning, by developing an algorithm whose sample complexity matches a lower bound proved in the same paper.

The authors propose a bilevel extension of the SARAH algorithm for bilevel empirical risk minimization. It reaches $\varepsilon$-stationarity in $\mathcal{O}((n+m)^{\frac12}\varepsilon^{-1})$ oracle calls, where $n+m$ is the total number of samples. This improves over previous bilevel methods and matches a new lower bound that the authors also provide.

Bilevel optimization problems, which are problems where two optimization problems are nested, have more and more applications in machine learning. In many practical cases, the upper and the lower objectives correspond to empirical risk minimization problems and therefore have a sum structure. In this context, we propose a bilevel extension of the celebrated SARAH algorithm. We demonstrate that the algorithm requires $\mathcal{O}((n+m)^{\frac12}\varepsilon^{-1})$ oracle calls to achieve $\varepsilon$-stationarity with $n+m$ the total number of samples, which improves over all previous bilevel algorithms. Moreover, we provide a lower bound on the number of oracle calls required to get an approximate stationary point of the objective function of the bilevel problem. This lower bound is attained by our algorithm, making it optimal in terms of sample complexity.
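
To make the SARAH building block mentioned above concrete, here is a minimal sketch of the single-level SARAH recursive gradient estimator on a finite-sum least-squares problem. It is not the bilevel extension proposed in the paper. The function name `sarah_least_squares`, its parameters, the least-squares objective, and the use of the last inner iterate as the next starting point are all assumptions made for illustration.

```python
import numpy as np


def sarah_least_squares(A, b, step, n_outer=20, n_inner=None, seed=0):
    """Illustrative single-level SARAH for f(w) = (1/2n) * ||A w - b||^2.

    Hypothetical helper, not the paper's bilevel algorithm.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    n_inner = n if n_inner is None else n_inner
    w = np.zeros(d)

    def full_grad(w):
        # Gradient of the full finite sum: n sample-gradient evaluations.
        return A.T @ (A @ w - b) / n

    def sample_grad(w, i):
        # Gradient of the i-th summand 0.5 * (a_i . w - b_i)^2.
        return A[i] * (A[i] @ w - b[i])

    for _ in range(n_outer):
        # Outer loop: reset the estimator with one exact full gradient.
        v = full_grad(w)
        w_prev, w = w, w - step * v
        for _ in range(n_inner):
            # Inner loop: recursive update using one sample, evaluated at
            # the current and the previous iterate.
            i = rng.integers(n)
            v = sample_grad(w, i) - sample_grad(w_prev, i) + v
            w_prev, w = w, w - step * v
        # Assumption: restart from the last inner iterate (a common
        # practical choice) rather than from a randomly drawn one.
    return w
```

A possible call, again only as a sketch: build a small consistent system with `b = A @ w_true`, choose `step = 1 / (2 * L)` with `L` the largest squared row norm of `A`, and compare the gradient norm at the returned point with the gradient norm at the zero vector. Each outer iteration costs one full pass plus two sample gradients per inner step. Estimators of this kind underlie the square-root dependence on the number of samples in the complexity quoted above.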

