LG | Feb 3, 2023

Efficient Gradient Approximation Method for Constrained Bilevel Optimization

arXiv:2302.01970v1 | 30 citations | h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses constrained bilevel optimization for machine learning tasks such as hyperparameter tuning. It is rated incremental because it extends existing gradient-based methods with a new approximation technique.

The paper tackles constrained bilevel optimization problems whose overall objective is non-convex and non-differentiable. It develops a gradient approximation method that computes representative gradients in a neighborhood of the current estimate. The algorithm converges asymptotically to the set of Clarke stationary points and is shown to be effective in hyperparameter optimization and meta-learning experiments.

Bilevel optimization has been developed for many machine learning tasks with large-scale and high-dimensional data. This paper considers a constrained bilevel optimization problem, where the lower-level optimization problem is convex with equality and inequality constraints and the upper-level optimization problem is non-convex. The overall objective function is non-convex and non-differentiable. To solve the problem, we develop a gradient-based approach, called the gradient approximation method, which determines the descent direction by computing several representative gradients of the objective function inside a neighborhood of the current estimate. We show that the algorithm asymptotically converges to the set of Clarke stationary points, and demonstrate the efficacy of the algorithm through experiments on hyperparameter optimization and meta-learning.
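To make the idea of choosing a descent direction from several gradients in a neighborhood more concrete, here is a minimal Python sketch in the spirit of classical gradient sampling. It is not the paper's algorithm: it skips the lower-level problem entirely and runs on a toy non-differentiable function. The names `min_norm_in_hull` and `sampled_gradient_descent`, the box-shaped neighborhood, the Frank-Wolfe inner solver, and all parameter values are assumptions chosen for illustration.

```python
import numpy as np


def min_norm_in_hull(grads, iters=100):
    """Approximate the minimum-norm point of the convex hull of the rows of
    `grads` using Frank-Wolfe with exact line search (illustrative helper)."""
    point = grads.mean(axis=0)
    for _ in range(iters):
        # Vertex of the hull that minimises the inner product with `point`.
        vertex = grads[np.argmin(grads @ point)]
        gap = point - vertex
        denom = gap @ gap
        if denom < 1e-16:
            break
        # Closed-form step that minimises the norm along the segment.
        step = np.clip((point @ gap) / denom, 0.0, 1.0)
        point = point - step * gap
    return point


def sampled_gradient_descent(f, grad, x0, radius=0.1, n_samples=8,
                             steps=100, seed=0):
    """Descend along the negated min-norm combination of gradients sampled
    in a box of half-width `radius` around the current estimate."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x0, dtype=float)
    for _ in range(steps):
        offsets = rng.uniform(-radius, radius, size=(n_samples, x.size))
        points = np.vstack([x, x + offsets])
        grads = np.array([grad(p) for p in points])
        direction = -min_norm_in_hull(grads)
        norm = np.linalg.norm(direction)
        if norm < 1e-6:
            radius *= 0.5  # approximately stationary at this radius
            continue
        # Backtracking (Armijo) line search; accept only steps that decrease f.
        fx, t = f(x), 1.0
        while t > 1e-10 and f(x + t * direction) > fx - 1e-4 * t * norm**2:
            t *= 0.5
        if t <= 1e-10:
            radius *= 0.5  # no acceptable step: shrink the neighborhood
            continue
        x = x + t * direction
    return x


# Toy non-differentiable objective (an assumption, not from the paper).
def f(x):
    return abs(x[0]) + 2.0 * abs(x[1])


def grad(x):
    # A valid (sub)gradient everywhere; the true gradient where f is smooth.
    return np.array([np.sign(x[0]), 2.0 * np.sign(x[1])])


x_start = np.array([3.0, -2.0])
x_final = sampled_gradient_descent(f, grad, x_start)
print(f(x_start), f(x_final))
```

Taking the minimum-norm element of the convex hull of the sampled gradients is what lets the direction account for kinks near the current point: when the samples straddle a non-differentiable ridge, gradients that disagree cancel, and a near-zero result signals approximate stationarity within that neighborhood.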

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
