LGJan 26, 2023
Finding Regions of Counterfactual Explanations via Robust OptimizationDonato Maragno, Jannis Kurtz, Tabea E. Röber et al.
Counterfactual explanations play an important role in detecting bias and improving the explainability of data-driven classification models. A counterfactual explanation (CE) is a minimal perturbed data point for which the decision of the model changes. Most of the existing methods can only provide one CE, which may not be achievable for the user. In this work we derive an iterative method to calculate robust CEs, i.e. CEs that remain valid even after the features are slightly perturbed. To this end, our method provides a whole region of CEs allowing the user to choose a suitable recourse to obtain a desired outcome. We use algorithmic ideas from robust optimization and prove convergence results for the most common machine learning methods including logistic regression, decision trees, random forests, and neural networks. Our experiments show that our method can efficiently generate globally optimal robust CEs for a variety of common data sets and classification models.
OCOct 6, 2023Code
Deep Learning for Two-Stage Robust Integer OptimizationJustin Dumouchelle, Esther Julien, Jannis Kurtz et al.
Robust optimization is an established framework for modeling optimization problems with uncertain parameters. While static robust optimization is often criticized for being too conservative, two-stage (or adjustable) robust optimization (2RO) provides a less conservative alternative by allowing some decisions to be made after the uncertain parameters have been revealed. Unfortunately, in the case of integer decision variables, existing solution methods for 2RO typically fail to solve large-scale instances, limiting the applicability of this modeling paradigm to simple cases. We propose Neur2RO, a deep-learning-augmented instantiation of the column-and-constraint-generation (CCG) algorithm, which expands the applicability of the 2RO framework to large-scale instances with integer decisions in both stages. A custom-designed neural network is trained to estimate the optimal value and feasibility of the second-stage problem. The network can be incorporated into CCG, leading to more computationally tractable subproblems in each of its iterations. The resulting algorithm enjoys approximation guarantees which depend on the neural network's prediction error. In our experiments, Neur2RO produces high-quality solutions quickly, outperforming state-of-the-art methods on two-stage knapsack, capital budgeting, and facility location problems. Compared to existing methods, which often run for hours, Neur2RO finds better solutions in a few seconds or minutes. Our code is available at https://github.com/khalil-research/Neur2RO.
OCFeb 10
Linear Model Extraction via Factual and Counterfactual QueriesDaan Otto, Jannis Kurtz, Dick den Hertog et al.
In model extraction attacks, the goal is to reveal the parameters of a black-box machine learning model by querying the model for a selected set of data points. Due to an increasing demand for explanations, this may involve counterfactual queries besides the typically considered factual queries. In this work, we consider linear models and three types of queries: factual, counterfactual, and robust counterfactual. First, for an arbitrary set of queries, we derive novel mathematical formulations for the classification regions for which the decision of the unknown model is known, without recovering any of the model parameters. Second, we derive bounds on the number of queries needed to extract the model's parameters for (robust) counterfactual queries under arbitrary norm-based distances. We show that the full model can be recovered using just a single counterfactual query when differentiable distance measures are employed. In contrast, when using polyhedral distances for instance, the number of required queries grows linearly with the dimension of the data space. For robust counterfactuals, the latter number of queries doubles. Consequently, the applied distance function and robustness of counterfactuals have a significant impact on the model's security.
OCMar 26
Globalized Adversarial Regret Optimization: Robust Decisions with Uncalibrated PredictionsJannis Kurtz, Bart P. G. van Parys
Optimization problems routinely depend on uncertain parameters that must be predicted before a decision is made. Classical robust and regret formulations are designed to handle erroneous predictions and can provide statistical error bounds in simple settings. However, when predictions lack rigorous error bounds (as is typical of modern machine learning methods) classical robust models often yield vacuous guarantees, while regret formulations can paradoxically produce decisions that are more optimistic than even a nominal solution. We introduce Globalized Adversarial Regret Optimization (GARO), a decision framework that controls adversarial regret, defined as the gap between the worst-case cost and the oracle robust cost, uniformly across all possible uncertainty set sizes. By design, GARO delivers absolute or relative performance guarantees against an oracle with full knowledge of the prediction error, without requiring any probabilistic calibration of the uncertainty set. We show that GARO equipped with a relative rate function generalizes the classical adaptation method of Lepski to downstream decision problems. We derive exact tractable reformulations for problems with affine worst-case cost functions and polyhedral norm uncertainty sets, and provide a discretization and a constraint-generation algorithm with convergence guarantees for general settings. Finally, experiments demonstrate that GARO yields solutions with a more favorable trade-off between worst-case and mean out-of-sample performance, as well as stronger global performance guarantees.
LGMar 3, 2022
Ensemble Methods for Robust Support Vector Machines using Integer ProgrammingJannis Kurtz
In this work we study binary classification problems where we assume that our training data is subject to uncertainty, i.e. the precise data points are not known. To tackle this issue in the field of robust machine learning the aim is to develop models which are robust against small perturbations in the training data. We study robust support vector machines (SVM) and extend the classical approach by an ensemble method which iteratively solves a non-robust SVM on different perturbations of the dataset, where the perturbations are derived by an adversarial problem. Afterwards for classification of an unknown data point we perform a majority vote of all calculated SVM solutions. We study three different variants for the adversarial problem, the exact problem, a relaxed variant and an efficient heuristic variant. While the exact and the relaxed variant can be modeled using integer programming formulations, the heuristic one can be implemented by an easy and efficient algorithm. All derived methods are tested on random and realistic datasets and the results indicate that the derived ensemble methods have a much more stable behaviour when changing the protection level compared to the classical robust SVM model.
AIFeb 26, 2024
From Large Language Models and Optimization to Decision Optimization CoPilot: A Research ManifestoSegev Wasserkrug, Leonard Boussioux, Dick den Hertog et al.
Significantly simplifying the creation of optimization models for real-world business problems has long been a major goal in applying mathematical optimization more widely to important business and societal decisions. The recent capabilities of Large Language Models (LLMs) present a timely opportunity to achieve this goal. Therefore, we propose research at the intersection of LLMs and optimization to create a Decision Optimization CoPilot (DOCP) - an AI tool designed to assist any decision maker, interacting in natural language to grasp the business problem, subsequently formulating and solving the corresponding optimization model. This paper outlines our DOCP vision and identifies several fundamental requirements for its implementation. We describe the state of the art through a literature survey and experiments using ChatGPT. We show that a) LLMs already provide substantial novel capabilities relevant to a DOCP, and b) major research challenges remain to be addressed. We also propose possible research directions to overcome these gaps. We also see this work as a call to action to bring together the LLM and optimization communities to pursue our vision, thereby enabling much more widespread improved decision-making.
OCFeb 4, 2024
Neur2BiLO: Neural Bilevel OptimizationJustin Dumouchelle, Esther Julien, Jannis Kurtz et al.
Bilevel optimization deals with nested problems in which a leader takes the first decision to minimize their objective function while accounting for a follower's best-response reaction. Constrained bilevel problems with integer variables are particularly notorious for their hardness. While exact solvers have been proposed for mixed-integer linear bilevel optimization, they tend to scale poorly with problem size and are hard to generalize to the non-linear case. On the other hand, problem-specific algorithms (exact and heuristic) are limited in scope. Under a data-driven setting in which similar instances of a bilevel problem are solved routinely, our proposed framework, Neur2BiLO, embeds a neural network approximation of the leader's or follower's value function, trained via supervised regression, into an easy-to-solve mixed-integer program. Neur2BiLO serves as a heuristic that produces high-quality solutions extremely fast for four applications with linear and non-linear objectives and pure and mixed-integer variables.
OCMay 24, 2024
Counterfactual Explanations for Linear OptimizationJannis Kurtz, Ş. İlker Birbil, Dick den Hertog
The concept of counterfactual explanations (CE) has emerged as one of the important concepts to understand the inner workings of complex AI systems. In this paper, we translate the idea of CEs to linear optimization and propose, motivate, and analyze three different types of CEs: strong, weak, and relative. While deriving strong and weak CEs appears to be computationally intractable, we show that calculating relative CEs can be done efficiently. By detecting and exploiting the hidden convex structure of the optimization problem that arises in the latter case, we show that obtaining relative CEs can be done in the same magnitude of time as solving the original linear optimization problem. This is confirmed by an extensive numerical experiment study on the NETLIB library.
OCFeb 7, 2025
Coherent Local Explanations for Mathematical OptimizationDaan Otto, Jannis Kurtz, S. Ilker Birbil
The surge of explainable artificial intelligence methods seeks to enhance transparency and explainability in machine learning models. At the same time, there is a growing demand for explaining decisions taken through complex algorithms used in mathematical optimization. However, current explanation methods do not take into account the structure of the underlying optimization problem, leading to unreliable outcomes. In response to this need, we introduce Coherent Local Explanations for Mathematical Optimization (CLEMO). CLEMO provides explanations for multiple components of optimization models, the objective value and decision variables, which are coherent with the underlying model structure. Our sampling-based procedure can provide explanations for the behavior of exact and heuristic solution algorithms. The effectiveness of CLEMO is illustrated by experiments for the shortest path problem, the knapsack problem, and the vehicle routing problem.
OCMar 30, 2022
Data-driven Prediction of Relevant Scenarios for Robust Combinatorial OptimizationMarc Goerigk, Jannis Kurtz
We study iterative methods for (two-stage) robust combinatorial optimization problems with discrete uncertainty. We propose a machine-learning-based heuristic to determine starting scenarios that provide strong lower bounds. To this end, we design dimension-independent features and train a Random Forest Classifier on small-dimensional instances. Experiments show that our method improves the solution process for larger instances than contained in the training set and also provides a feature importance-score which gives insights into the role of scenario properties.
OCOct 21, 2021
Efficient and Robust Mixed-Integer Optimization Methods for Training Binarized Deep Neural NetworksJannis Kurtz, Bubacarr Bah
Compared to classical deep neural networks its binarized versions can be useful for applications on resource-limited devices due to their reduction in memory consumption and computational demands. In this work we study deep neural networks with binary activation functions and continuous or integer weights (BDNN). We show that the BDNN can be reformulated as a mixed-integer linear program with bounded weight space which can be solved to global optimality by classical mixed-integer programming solvers. Additionally, a local search heuristic is presented to calculate locally optimal networks. Furthermore to improve efficiency we present an iterative data-splitting heuristic which iteratively splits the training set into smaller subsets by using the k-mean method. Afterwards all data points in a given subset are forced to follow the same activation pattern, which leads to a much smaller number of integer variables in the mixed-integer programming formulation and therefore to computational improvements. Finally for the first time a robust model is presented which enforces robustness of the BDNN during training. All methods are tested on random and real datasets and our results indicate that all models can often compete with or even outperform classical DNNs on small network architectures confirming the viability for applications having restricted memory or computing power.
OCNov 19, 2020
Data-Driven Robust Optimization using Unsupervised Deep LearningMarc Goerigk, Jannis Kurtz
Robust optimization has been established as a leading methodology to approach decision problems under uncertainty. To derive a robust optimization model, a central ingredient is to identify a suitable model for uncertainty, which is called the uncertainty set. An ongoing challenge in the recent literature is to derive uncertainty sets from given historical data that result in solutions that are robust regarding future scenarios. In this paper we use an unsupervised deep learning method to learn and extract hidden structures from data, leading to non-convex uncertainty sets and better robust solutions. We prove that most of the classical uncertainty classes are special cases of our derived sets and that optimizing over them is strongly NP-hard. Nevertheless, we show that the trained neural networks can be integrated into a robust optimization model by formulating the adversarial problem as a convex quadratic mixed-integer program. This allows us to derive robust solutions through an iterative scenario generation process. In our computational experiments, we compare this approach to a similar approach using kernel-based support vector clustering. We find that uncertainty sets derived by the unsupervised deep learning method find a better description of data and lead to robust solutions that outperform the comparison method both with respect to objective value and feasibility.
OCJul 7, 2020
An Integer Programming Approach to Deep Neural Networks with Binary Activation FunctionsBubacarr Bah, Jannis Kurtz
We study deep neural networks with binary activation functions (BDNN), i.e. the activation function only has two states. We show that the BDNN can be reformulated as a mixed-integer linear program which can be solved to global optimality by classical integer programming solvers. Additionally, a heuristic solution algorithm is presented and we study the model under data uncertainty, applying a two-stage robust optimization approach. We implemented our methods on random and real datasets and show that the heuristic version of the BDNN outperforms classical deep neural networks on the Breast Cancer Wisconsin dataset while performing worse on random data.