LGJun 4, 2021

Debiasing a First-order Heuristic for Approximate Bi-level Optimization

arXiv:2106.02487v25 citations
AI Analysis

This addresses convergence issues in bi-level optimization for deep learning applications, though it is incremental as it builds on an existing heuristic.

The paper tackled the problem of gradient bias in a first-order heuristic for approximate bi-level optimization, which can cause non-convergence, and proposed an unbiased method (UFOM) with constant memory complexity and convergence guarantees.

Approximate bi-level optimization (ABLO) consists of (outer-level) optimization problems, involving numerical (inner-level) optimization loops. While ABLO has many applications across deep learning, it suffers from time and memory complexity proportional to the length $r$ of its inner optimization loop. To address this complexity, an earlier first-order method (FOM) was proposed as a heuristic that omits second derivative terms, yielding significant speed gains and requiring only constant memory. Despite FOM's popularity, there is a lack of theoretical understanding of its convergence properties. We contribute by theoretically characterizing FOM's gradient bias under mild assumptions. We further demonstrate a rich family of examples where FOM-based SGD does not converge to a stationary point of the ABLO objective. We address this concern by proposing an unbiased FOM (UFOM) enjoying constant memory complexity as a function of $r$. We characterize the introduced time-variance tradeoff, demonstrate convergence bounds, and find an optimal UFOM for a given ABLO problem. Finally, we propose an efficient adaptive UFOM scheme.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes