OCLGNov 29, 2021

Amortized Implicit Differentiation for Stochastic Bilevel Optimization

arXiv:2111.14580v382 citations
Originality Incremental advance
AI Analysis

This work addresses bilevel optimization, a problem relevant for machine learning tasks like hyper-parameter tuning, by providing a more efficient algorithm that matches oracle performance, though it is incremental as it builds on existing implicit differentiation methods.

The paper tackles bilevel optimization problems by introducing amortized algorithms based on inexact implicit differentiation and a warm-start strategy, achieving computational complexity matching oracle methods with unbiased gradient estimates and demonstrating efficiency on hyper-parameter optimization with thousands of variables.

We study a class of algorithms for solving bilevel optimization problems in both stochastic and deterministic settings when the inner-level objective is strongly convex. Specifically, we consider algorithms based on inexact implicit differentiation and we exploit a warm-start strategy to amortize the estimation of the exact gradient. We then introduce a unified theoretical framework inspired by the study of singularly perturbed systems (Habets, 1974) to analyze such amortized algorithms. By using this framework, our analysis shows these algorithms to match the computational complexity of oracle methods that have access to an unbiased estimate of the gradient, thus outperforming many existing results for bilevel optimization. We illustrate these findings on synthetic experiments and demonstrate the efficiency of these algorithms on hyper-parameter optimization experiments involving several thousands of variables.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes