LGMay 30, 2022

Backpropagation through Combinatorial Algorithms: Identity with Projection Works

arXiv:2205.15213v337 citationsh-index: 33
Originality Incremental advance
AI Analysis

This addresses the challenge of integrating combinatorial reasoning into neural networks for researchers in machine learning, offering a simpler alternative to existing methods, though it appears incremental as it builds on prior smoothing and relaxation techniques.

The paper tackles the problem of embedding discrete solvers as differentiable layers in deep learning, where gradients are zero or undefined, by proposing a method that treats the solver as a negative identity on the backward pass with a theoretical justification. Experiments show this hyper-parameter-free approach competes with prior complex methods on tasks like discrete samplers, graph matching, and image retrieval, and introduces a generic regularization to prevent cost collapse.

Embedding discrete solvers as differentiable layers has given modern deep learning architectures combinatorial expressivity and discrete reasoning capabilities. The derivative of these solvers is zero or undefined, therefore a meaningful replacement is crucial for effective gradient-based learning. Prior works rely on smoothing the solver with input perturbations, relaxing the solver to continuous problems, or interpolating the loss landscape with techniques that typically require additional solver calls, introduce extra hyper-parameters, or compromise performance. We propose a principled approach to exploit the geometry of the discrete solution space to treat the solver as a negative identity on the backward pass and further provide a theoretical justification. Our experiments demonstrate that such a straightforward hyper-parameter-free approach is able to compete with previous more complex methods on numerous experiments such as backpropagation through discrete samplers, deep graph matching, and image retrieval. Furthermore, we substitute the previously proposed problem-specific and label-dependent margin with a generic regularization procedure that prevents cost collapse and increases robustness.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes