OC LGFeb 5, 2025

An analysis of optimization problems involving ReLU neural networks

Christoph Plate, Mirko Hahn, Alexander Klimek, Caroline Ganzer, Kai Sundmacher, Sebastian Sager

arXiv:2502.03016v111.35 citationsh-index: 6

Originality Incremental advance

AI Analysis

This work addresses computational bottlenecks for researchers and practitioners in optimization and machine learning dealing with neural network-integrated problems, though it is incremental as it builds on existing methods.

The paper tackles the challenge of solving mixed-integer optimization problems with embedded ReLU neural networks, where Big-M coefficients grow exponentially with layers, and proposes approaches like clipped variants and regularization to improve solver runtime, quantifying a trade-off between model redundancy and computational costs in benchmark tests.

Solving mixed-integer optimization problems with embedded neural networks with ReLU activation functions is challenging. Big-M coefficients that arise in relaxing binary decisions related to these functions grow exponentially with the number of layers. We survey and propose different approaches to analyze and improve the run time behavior of mixed-integer programming solvers in this context. Among them are clipped variants and regularization techniques applied during training as well as optimization-based bound tightening and a novel scaling for given ReLU networks. We numerically compare these approaches for three benchmark problems from the literature. We use the number of linear regions, the percentage of stable neurons, and overall computational effort as indicators. As a major takeaway we observe and quantify a trade-off between the often desired redundancy of neural network models versus the computational costs for solving related optimization problems.

View on arXiv PDF

Similar