LGOCMLFeb 20, 2020

Learning with Differentiable Perturbed Optimizers

arXiv:2002.08676v2131 citations
Originality Highly original
AI Analysis

This addresses a fundamental bottleneck for researchers and practitioners in machine learning who need to incorporate discrete decisions into differentiable models, enabling broader application of end-to-end learning.

The paper tackles the problem of integrating discrete optimization steps (e.g., sorting, shortest paths) into end-to-end machine learning pipelines, which typically break back-propagation due to non-differentiability, by proposing a method using stochastically perturbed optimizers to make them differentiable and tunable via noise amplitude, with experimental demonstrations on various tasks.

Machine learning pipelines often rely on optimization procedures to make discrete decisions (e.g., sorting, picking closest neighbors, or shortest paths). Although these discrete decisions are easily computed, they break the back-propagation of computational graphs. In order to expand the scope of learning problems that can be solved in an end-to-end fashion, we propose a systematic method to transform optimizers into operations that are differentiable and never locally constant. Our approach relies on stochastically perturbed optimizers, and can be used readily together with existing solvers. Their derivatives can be evaluated efficiently, and smoothness tuned via the chosen noise amplitude. We also show how this framework can be connected to a family of losses developed in structured prediction, and give theoretical guarantees for their use in learning tasks. We demonstrate experimentally the performance of our approach on various tasks.

Code Implementations3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes