LGOCMay 22, 2023

Effective Bilevel Optimization via Minimax Reformulation

arXiv:2305.13153v43 citations
AI Analysis

This addresses a bottleneck for researchers and practitioners using bilevel optimization in large-scale applications like hyper-parameter tuning and meta-learning, offering a more efficient solution.

The paper tackles the high computational cost of bilevel optimization in machine learning by reformulating it as a minimax problem, showing that their method outperforms state-of-the-art approaches with reduced computational expense.

Bilevel optimization has found successful applications in various machine learning problems, including hyper-parameter optimization, data cleaning, and meta-learning. However, its huge computational cost presents a significant challenge for its utilization in large-scale problems. This challenge arises due to the nested structure of the bilevel formulation, where each hyper-gradient computation necessitates a costly inner optimization procedure. To address this issue, we propose a reformulation of bilevel optimization as a minimax problem, effectively decoupling the outer-inner dependency. Under mild conditions, we show these two problems are equivalent. Furthermore, we introduce a multi-stage gradient descent and ascent (GDA) algorithm to solve the resulting minimax problem with convergence guarantees. Extensive experimental results demonstrate that our method outperforms state-of-the-art bilevel methods while significantly reducing the computational cost.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes