LG · Jul 8, 2024

LPGD: A General Framework for Backpropagation through Embedded Optimization Layers

arXiv:2407.05920v1 · 5 citations · h-index: 8
Originality: Incremental advance
AI Analysis

The paper addresses a key bottleneck for researchers and practitioners who use optimization layers in ML. It offers a flexible, theoretically grounded solution, though it builds on prior methods.

The paper tackles the problem of training machine learning architectures with embedded optimization layers, where degenerate derivatives make gradients uninformative. It proposes the Lagrangian Proximal Gradient Descent (LPGD) framework, which efficiently computes meaningful gradient replacements and, in the reported experiments, converges faster than gradient descent.

Embedding parameterized optimization problems as layers into machine learning architectures serves as a powerful inductive bias. Training such architectures with stochastic gradient descent requires care, as degenerate derivatives of the embedded optimization problem often render the gradients uninformative. We propose Lagrangian Proximal Gradient Descent (LPGD), a flexible framework for training architectures with embedded optimization layers that seamlessly integrates into automatic differentiation libraries. LPGD efficiently computes meaningful replacements of the degenerate optimization layer derivatives by re-running the forward solver oracle on a perturbed input. LPGD captures various previously proposed methods as special cases, while fostering deep links to traditional optimization methods. We theoretically analyze our method and demonstrate on historical and synthetic data that LPGD converges faster than gradient descent even in a differentiable setup.
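
The idea of replacing a degenerate derivative by re-running the solver on a perturbed input can be illustrated with a small sketch. This is not the paper's implementation: it assumes a toy layer with a linear cost over one-hot vectors, and the names `solver`, `perturbed_resolve_grad`, and the step parameter `tau` are hypothetical. The finite-difference form used here corresponds to the kind of previously proposed method that the abstract says LPGD captures as a special case.

```python
import numpy as np

def solver(w):
    """Hypothetical forward oracle: argmin_x <w, x> over one-hot vectors x.

    The output is piecewise constant in w, so its true Jacobian is zero
    almost everywhere and carries no training signal.
    """
    x = np.zeros_like(w, dtype=float)
    x[np.argmin(w)] = 1.0
    return x

def perturbed_resolve_grad(w, grad_x, tau=1.0):
    """Gradient replacement for w (assumed finite-difference form).

    grad_x is the incoming loss gradient with respect to the solver output.
    The solver is re-run on the cost vector shifted by tau * grad_x, and the
    scaled difference of the two solutions stands in for the degenerate
    derivative.
    """
    x = solver(w)
    x_tau = solver(w + tau * grad_x)
    return (x_tau - x) / tau

# Toy usage: push the solver output toward a target one-hot vector.
w = np.array([1.0, 2.0, 3.0])
target = np.array([0.0, 1.0, 0.0])
x = solver(w)
grad_x = x - target                  # gradient of 0.5 * ||x - target||^2
g = perturbed_resolve_grad(w, grad_x, tau=2.0)
w_new = w - 4.0 * g                  # one plain gradient-descent step
```

In this toy setup the replacement `g` is nonzero even though the true Jacobian of `solver` vanishes, and the single descent step moves `w` so that the solver then selects the target. The choice of `tau` matters: a perturbation too small to change the solver's output yields a zero replacement.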

Code Implementations: 1 repo
