OCFeb 20, 2023
Model-based feature selection for neural networks: A mixed-integer programming approachShudian Zhao, Calvin Tsay, Jan Kronqvist
In this work, we develop a novel input feature selection framework for ReLU-based deep neural networks (DNNs), which builds upon a mixed-integer optimization approach. While the method is generally applicable to various classification tasks, we focus on finding input features for image classification for clarity of presentation. The idea is to use a trained DNN, or an ensemble of trained DNNs, to identify the salient input features. The input feature selection is formulated as a sequence of mixed-integer linear programming (MILP) problems that find sets of sparse inputs that maximize the classification confidence of each category. These ''inverse'' problems are regularized by the number of inputs selected for each category and by distribution constraints. Numerical results on the well-known MNIST and FashionMNIST datasets show that the proposed input feature selection allows us to drastically reduce the size of the input to $\sim$15\% while maintaining a good classification accuracy. This allows us to design DNNs with significantly fewer connections, reducing computational effort and producing DNNs that are more robust towards adversarial attacks.
LGSep 18, 2024
A constrained optimization approach to improve robustness of neural networksShudian Zhao, Jan Kronqvist
In this paper, we present a novel nonlinear programming-based approach to fine-tune pre-trained neural networks to improve robustness against adversarial attacks while maintaining high accuracy on clean data. Our method introduces adversary-correction constraints to ensure correct classification of adversarial data and minimizes changes to the model parameters. We propose an efficient cutting-plane-based algorithm to iteratively solve the large-scale nonconvex optimization problem by approximating the feasible region through polyhedral cuts and balancing between robustness and accuracy. Computational experiments on standard datasets such as MNIST and CIFAR10 demonstrate that the proposed approach significantly improves robustness, even with a very small set of adversarial data, while maintaining minimal impact on accuracy.
9.0SYMay 6
ADMM-based decomposed DNN+RLT Relaxations for Completely Positive Models in Electricity Market ClearingShudian Zhao, Mohammad Reza Karimi Gharigh, Jan Kronqvist et al.
The day-ahead electricity market clearing with nonconvex order types can be formulated as a mixed-integer linear program (MILP), but its LP relaxation may provide weak bounds, and exact solutions can become computationally intractable in large-scale or extended market settings. We study a welfare-maximizing clearing model with elementary hourly orders, block orders with logical acceptance constraints, and flexible hourly orders. Starting from a compact MILP formulation, we derive an equivalent completely positive programming (CPP) reformulation via matrix lifting and propose relaxed CPP variants that further reduce the modeling burden while maintaining strong bounds. We then develop tractable doubly nonnegative (DNN) relaxations, including decomposed formulations that exploit the problem structure by using smaller positive semidefinite matrices. To further strengthen these bounds, we introduce reformulation-linearization technique (RLT) inequalities tailored to the decomposed structure. To tackle the challenge of large-scale DNNs, we design an alternating direction method of multipliers (ADMM) with adaptive penalty updates and rigorous dual lower bounds, enabling certified early termination. Computational experiments on synthetic instances show that the proposed DNN+RLT relaxations substantially tighten LP bounds, while decomposition and first-order methods significantly reduce computational effort.