Differentiating Through Integer Linear Programs with Quadratic Regularization and Davis-Yin Splitting
This work addresses the problem of integrating combinatorial optimization with neural network training for researchers and practitioners in machine learning and optimization, though it is incremental as it builds on prior techniques like JFB and DYS.
The paper tackles the challenge of training neural networks to predict parameters for combinatorial optimization problems, specifically Integer Linear Programs (ILPs), by proposing a method that combines Davis-Yin splitting for forward passes and Jacobian-free backpropagation for backward passes, resulting in a scheme that scales more effectively to high-dimensional problems than existing methods, as demonstrated on shortest path and knapsack problems.
In many applications, a combinatorial problem must be repeatedly solved with similar, but distinct parameters. Yet, the parameters $w$ are not directly observed; only contextual data $d$ that correlates with $w$ is available. It is tempting to use a neural network to predict $w$ given $d$. However, training such a model requires reconciling the discrete nature of combinatorial optimization with the gradient-based frameworks used to train neural networks. We study the case where the problem in question is an Integer Linear Program (ILP). We propose applying a three-operator splitting technique, also known as Davis-Yin splitting (DYS), to the quadratically regularized continuous relaxation of the ILP. We prove that the resulting scheme is compatible with the recently introduced Jacobian-free backpropagation (JFB). Our experiments on two representative ILPs: the shortest path problem and the knapsack problem, demonstrate that this combination-DYS on the forward pass, JFB on the backward pass-yields a scheme which scales more effectively to high-dimensional problems than existing schemes. All code associated with this paper is available at github.com/mines-opt-ml/fpo-dys.