Mahyar Fazlyab

LG
h-index18
28papers
1,201citations
Novelty55%
AI Score55

28 Papers

97.1SYMay 29
Generalized Model Predictive Path Integral Control as Expectation--Maximization

Jiarui Wang, Sina Sharifi, Mahyar Fazlyab

Model Predictive Path Integral (MPPI) control is a powerful sampling-based method for solving stochastic optimal control problems and has enabled real-time control in complex robotic systems. Despite its empirical success, its theoretical understanding remains limited. In this work, we show that MPPI can be interpreted as a special case of the Expectation-Maximization (EM) algorithm applied to a probabilistic inference formulation of optimal control. This perspective leads to a generalized EM-MPPI framework that extends MPPI beyond the commonly used Gaussian parameterization. We analyze the convergence behavior of this algorithm and characterize the local convergence rate in terms of the covariance of the posterior trajectory distribution and the exploration distribution. For exponential-family distributions, we establish a sufficient increase property of the log-likelihood when the log-partition function is strongly convex. Specializing the analysis to Gaussian MPPI yields explicit global and local convergence characterizations. The code for the experiments will be available upon acceptance.

LGJan 27, 2023
Certified Invertibility in Neural Networks via Mixed-Integer Programming

Tianqi Cui, Thomas Bertalan, George J. Pappas et al.

Neural networks are known to be vulnerable to adversarial attacks, which are small, imperceptible perturbations that can significantly alter the network's output. Conversely, there may exist large, meaningful perturbations that do not affect the network's decision (excessive invariance). In our research, we investigate this latter phenomenon in two contexts: (a) discrete-time dynamical system identification, and (b) the calibration of a neural network's output to that of another network. We examine noninvertibility through the lens of mathematical optimization, where the global solution measures the ``safety" of the network predictions by their distance from the non-invertibility boundary. We formulate mixed-integer programs (MIPs) for ReLU networks and $L_p$ norms ($p=1,2,\infty$) that apply to neural network approximators of dynamical systems. We also discuss how our findings can be useful for invertibility certification in transformations between neural networks, e.g. between different levels of network pruning.

LGSep 29, 2023
Certified Robustness via Dynamic Margin Maximization and Improved Lipschitz Regularization

Mahyar Fazlyab, Taha Entesari, Aniket Roy et al.

To improve the robustness of deep classifiers against adversarial perturbations, many approaches have been proposed, such as designing new architectures with better robustness properties (e.g., Lipschitz-capped networks), or modifying the training process itself (e.g., min-max optimization, constrained learning, or regularization). These approaches, however, might not be effective at increasing the margin in the input (feature) space. As a result, there has been an increasing interest in developing training procedures that can directly manipulate the decision boundary in the input space. In this paper, we build upon recent developments in this category by developing a robust training algorithm whose objective is to increase the margin in the output (logit) space while regularizing the Lipschitz constant of the model along vulnerable directions. We show that these two objectives can directly promote larger margins in the input space. To this end, we develop a scalable method for calculating guaranteed differentiable upper bounds on the Lipschitz constant of neural networks accurately and efficiently. The relative accuracy of the bounds prevents excessive regularization and allows for more direct manipulation of the decision boundary. Furthermore, our Lipschitz bounding algorithm exploits the monotonicity and Lipschitz continuity of the activation layers, and the resulting bounds can be used to design new layers with controllable bounds on their Lipschitz constant. Experiments on the MNIST, CIFAR-10, and Tiny-ImageNet data sets verify that our proposed algorithm obtains competitively improved results compared to the state-of-the-art.

SYDec 14, 2022
Automated Reachability Analysis of Neural Network-Controlled Systems via Adaptive Polytopes

Taha Entesari, Mahyar Fazlyab

Over-approximating the reachable sets of dynamical systems is a fundamental problem in safety verification and robust control synthesis. The representation of these sets is a key factor that affects the computational complexity and the approximation error. In this paper, we develop a new approach for over-approximating the reachable sets of neural network dynamical systems using adaptive template polytopes. We use the singular value decomposition of linear layers along with the shape of the activation functions to adapt the geometry of the polytopes at each time step to the geometry of the true reachable sets. We then propose a branch-and-bound method to compute accurate over-approximations of the reachable sets by the inferred templates. We illustrate the utility of the proposed approach in the reachability analysis of linear systems driven by neural network controllers.

OCJul 18, 2022
Towards Understanding The Semidefinite Relaxations of Truncated Least-Squares in Robust Rotation Search

Liangzu Peng, Mahyar Fazlyab, René Vidal

The rotation search problem aims to find a 3D rotation that best aligns a given number of point pairs. To induce robustness against outliers for rotation search, prior work considers truncated least-squares (TLS), which is a non-convex optimization problem, and its semidefinite relaxation (SDR) as a tractable alternative. Whether this SDR is theoretically tight in the presence of noise, outliers, or both has remained largely unexplored. We derive conditions that characterize the tightness of this SDR, showing that the tightness depends on the noise level, the truncation parameters of TLS, and the outlier distribution (random or clustered). In particular, we give a short proof for the tightness in the noiseless and outlier-free case, as opposed to the lengthy analysis of prior work.

66.7OCMar 25
Model Predictive Path Integral Control as Preconditioned Gradient Descent

Mahyar Fazlyab, Sina Sharifi, Jiarui Wang

Model Predictive Path Integral (MPPI) control is a popular sampling-based method for trajectory optimization in nonlinear and nonconvex settings, yet its optimization structure remains only partially understood. We develop a variational, optimization-theoretic interpretation of MPPI by lifting constrained trajectory optimization to a KL-regularized problem over distributions and reducing it to a negative log-partition (free-energy) objective over a tractable sampling family. For a general parametric family, this yields a preconditioned gradient method on the distribution parameters and a natural multi-step extension of MPPI. For the fixed-covariance Gaussian family, we show that classical MPPI is recovered exactly as a preconditioned gradient descent step with unit step size. This interpretation enables a direct convergence analysis: under bounded feasible sets, we derive an explicit upper bound on the smoothness constant and a simple sufficient condition guaranteeing descent of exact MPPI. Numerical experiments support the theory and illustrate the effect of key hyperparameters on performance.

CVApr 18, 2024Code
Gradient-Regularized Out-of-Distribution Detection

Sina Sharifi, Taha Entesari, Bardia Safaei et al.

One of the challenges for neural networks in real-life applications is the overconfident errors these models make when the data is not from the original training distribution. Addressing this issue is known as Out-of-Distribution (OOD) detection. Many state-of-the-art OOD methods employ an auxiliary dataset as a surrogate for OOD data during training to achieve improved performance. However, these methods fail to fully exploit the local information embedded in the auxiliary dataset. In this work, we propose the idea of leveraging the information embedded in the gradient of the loss function during training to enable the network to not only learn a desired OOD score for each sample but also to exhibit similar behavior in a local neighborhood around each sample. We also develop a novel energy-based sampling method to allow the network to be exposed to more informative OOD samples during the training phase. This is especially important when the auxiliary dataset is large. We demonstrate the effectiveness of our method through extensive experiments on several OOD benchmarks, improving the existing state-of-the-art FPR95 by 4% on our ImageNet experiment. We further provide a theoretical analysis through the lens of certified robustness and Lipschitz analysis to showcase the theoretical foundation of our work. Our code is available at https://github.com/o4lc/Greg-OOD.

59.6LGMay 11
Hierarchical End-to-End Taylor Bounds for Complete Neural Network Verification

Taha Entesari, Mahyar Fazlyab

Reachability analysis of neural networks, which seeks to compute or bound the set of outputs attainable over a given input domain, is central to certifying safety and robustness in learning-enabled physical systems. Since exact reachable set computation is generally intractable, existing methods typically rely on tractable overapproximations. Examining the state of the art for smooth, twice-differentiable networks, we observe that existing approaches exploit at most second-order information and do not systematically leverage higher-order information. In this work, we introduce \textsc{HiTaB}, a novel verification framework that exploits second-order smoothness through both the Hessian, $\nabla^2 f$, and its Lipschitz constant, $L_{\nabla^2 f}$. We further develop a unified hierarchy of zeroth-, first-, and second-order bounds, together with precise conditions under which higher-order approximations yield provable improvements. Our main technical contribution is a compositional procedure for efficiently bounding $L_{\nabla^2 f}$ in deep neural networks via layerwise propagation of curvature bounds. We extend the framework to both $\ell_2$- and $\ell_\infty$-constrained input sets and show how it can be integrated into branch-and-bound verification pipelines. To our knowledge, this is the first practical reachability analysis framework for smooth neural networks that systematically exploits Lipschitz continuity of curvature, leading to tighter and more informative safety certificates.

LGMar 13, 2024
Actor-Critic Physics-informed Neural Lyapunov Control

Jiarui Wang, Mahyar Fazlyab

Designing control policies for stabilization tasks with provable guarantees is a long-standing problem in nonlinear control. A crucial performance metric is the size of the resulting region of attraction, which essentially serves as a robustness "margin" of the closed-loop system against uncertainties. In this paper, we propose a new method to train a stabilizing neural network controller along with its corresponding Lyapunov certificate, aiming to maximize the resulting region of attraction while respecting the actuation constraints. Crucial to our approach is the use of Zubov's Partial Differential Equation (PDE), which precisely characterizes the true region of attraction of a given control policy. Our framework follows an actor-critic pattern where we alternate between improving the control policy (actor) and learning a Zubov function (critic). Finally, we compute the largest certifiable region of attraction by invoking an SMT solver after the training procedure. Our numerical experiments on several design problems show consistent and significant improvements in the size of the resulting region of attraction.

LGJan 11, 2024
Learning Performance-Oriented Control Barrier Functions Under Complex Safety Constraints and Limited Actuation

Lakshmideepakreddy Manda, Shaoru Chen, Mahyar Fazlyab

Control Barrier Functions (CBFs) provide an elegant framework for constraining nonlinear control system dynamics to remain within an invariant subset of a designated safe set. However, identifying a CBF that balances performance-by maximizing the control invariant set-and accommodates complex safety constraints, especially in systems with high relative degree and actuation limits, poses a significant challenge. In this work, we introduce a novel self-supervised learning framework to comprehensively address these challenges. Our method begins with a Boolean composition of multiple state constraints that define the safe set. We first construct a smooth function whose zero superlevel set forms an inner approximation of this safe set. This function is then combined with a smooth neural network to parameterize the CBF candidate. To train the CBF and maximize the volume of the resulting control invariant set, we design a physics-informed loss function based on a Hamilton-Jacobi Partial Differential Equation (PDE). We validate the efficacy of our approach on a 2D double integrator (DI) system and a 7D fixed-wing aircraft system (F16).

LGMar 12, 2024
Verification-Aided Learning of Neural Network Barrier Functions with Termination Guarantees

Shaoru Chen, Lekan Molu, Mahyar Fazlyab

Barrier functions are a general framework for establishing a safety guarantee for a system. However, there is no general method for finding these functions. To address this shortcoming, recent approaches use self-supervised learning techniques to learn these functions using training data that are periodically generated by a verification procedure, leading to a verification-aided learning framework. Despite its immense potential in automating barrier function synthesis, the verification-aided learning framework does not have termination guarantees and may suffer from a low success rate of finding a valid barrier function in practice. In this paper, we propose a holistic approach to address these drawbacks. With a convex formulation of the barrier function synthesis, we propose to first learn an empirically well-behaved NN basis function and then apply a fine-tuning algorithm that exploits the convexity and counterexamples from the verification failure to find a valid barrier function with finite-step termination guarantees: if there exist valid barrier functions, the fine-tuning algorithm is guaranteed to find one in a finite number of iterations. We demonstrate that our fine-tuning method can significantly boost the performance of the verification-aided learning framework on examples of different scales and using various neural network verifiers.

OCJan 27, 2025
Safe Gradient Flow for Bilevel Optimization

Sina Sharifi, Nazanin Abolfazli, Erfan Yazdandoost Hamedani et al.

Bilevel optimization is a key framework in hierarchical decision-making, where one problem is embedded within the constraints of another. In this work, we propose a control-theoretic approach to solving bilevel optimization problems. Our method consists of two components: a gradient flow mechanism to minimize the upper-level objective and a safety filter to enforce the constraints imposed by the lower-level problem. Together, these components form a safe gradient flow that solves the bilevel problem in a single loop. To improve scalability with respect to the lower-level problem's dimensions, we introduce a relaxed formulation and design a compact variant of the safe gradient flow. This variant minimizes the upper-level objective while ensuring the lower-level decision variable remains within a user-defined suboptimality. Using Lyapunov analysis, we establish convergence guarantees for the dynamics, proving that they converge to a neighborhood of the optimal solution. Numerical experiments further validate the effectiveness of the proposed approaches. Our contributions provide both theoretical insights and practical tools for efficiently solving bilevel optimization problems.

SYOct 18, 2024
Domain Adaptive Safety Filters via Deep Operator Learning

Lakshmideepakreddy Manda, Shaoru Chen, Mahyar Fazlyab

Learning-based approaches for constructing Control Barrier Functions (CBFs) are increasingly being explored for safety-critical control systems. However, these methods typically require complete retraining when applied to unseen environments, limiting their adaptability. To address this, we propose a self-supervised deep operator learning framework that learns the mapping from environmental parameters to the corresponding CBF, rather than learning the CBF directly. Our approach leverages the residual of a parametric Partial Differential Equation (PDE), where the solution defines a parametric CBF approximating the maximal control invariant set. This framework accommodates complex safety constraints, higher relative degrees, and actuation limits. We demonstrate the effectiveness of the method through numerical experiments on navigation tasks involving dynamic obstacles.

CLJun 5, 2025
Constrained Entropic Unlearning: A Primal-Dual Framework for Large Language Models

Taha Entesari, Arman Hatami, Rinat Khaziev et al.

Large Language Models (LLMs) deployed in real-world settings increasingly face the need to unlearn sensitive, outdated, or proprietary information. Existing unlearning methods typically formulate forgetting and retention as a regularized trade-off, combining both objectives into a single scalarized loss. This often leads to unstable optimization and degraded performance on retained data, especially under aggressive forgetting. We propose a new formulation of LLM unlearning as a constrained optimization problem: forgetting is enforced via a novel logit-margin flattening loss that explicitly drives the output distribution toward uniformity on a designated forget set, while retention is preserved through a hard constraint on a separate retain set. Compared to entropy-based objectives, our loss is softmax-free, numerically stable, and maintains non-vanishing gradients, enabling more efficient and robust optimization. We solve the constrained problem using a scalable primal-dual algorithm that exposes the trade-off between forgetting and retention through the dynamics of the dual variable, all without any extra computational overhead. Evaluations on the TOFU and MUSE benchmarks across diverse LLM architectures demonstrate that our approach consistently matches or exceeds state-of-the-art baselines, effectively removing targeted information while preserving downstream utility.

OCMay 20, 2025
Sequential QCQP for Bilevel Optimization with Line Search

Sina Sharifi, Erfan Yazdandoost Hamedani, Mahyar Fazlyab

Bilevel optimization involves a hierarchical structure where one problem is nested within another, leading to complex interdependencies between levels. We propose a single-loop, tuning-free algorithm that guarantees anytime feasibility, i.e., approximate satisfaction of the lower-level optimality condition, while ensuring descent of the upper-level objective. At each iteration, a convex quadratically-constrained quadratic program (QCQP) with a closed-form solution yields the search direction, followed by a backtracking line search inspired by control barrier functions to ensure safe, uniformly positive step sizes. The resulting method is scalable, requires no hyperparameter tuning, and converges under mild local regularity assumptions. We establish an O(1/k) ergodic convergence rate in terms of a first-order stationary metric and demonstrate the algorithm's effectiveness on representative bilevel tasks.

LGJun 7, 2024
Compositional Curvature Bounds for Deep Neural Networks

Taha Entesari, Sina Sharifi, Mahyar Fazlyab

A key challenge that threatens the widespread use of neural networks in safety-critical applications is their vulnerability to adversarial attacks. In this paper, we study the second-order behavior of continuously differentiable deep neural networks, focusing on robustness against adversarial perturbations. First, we provide a theoretical analysis of robustness and attack certificates for deep classifiers by leveraging local gradients and upper bounds on the second derivative (curvature constant). Next, we introduce a novel algorithm to analytically compute provable upper bounds on the second derivative of neural networks. This algorithm leverages the compositional structure of the model to propagate the curvature bound layer-by-layer, giving rise to a scalable and modular approach. The proposed bound can serve as a differentiable regularizer to control the curvature of neural networks during training, thereby enhancing robustness. Finally, we demonstrate the efficacy of our method on classification tasks using the MNIST and CIFAR-10 datasets.

LGJun 6, 2024
Provable Bounds on the Hessian of Neural Networks: Derivative-Preserving Reachability Analysis

Sina Sharifi, Mahyar Fazlyab

We propose a novel reachability analysis method tailored for neural networks with differentiable activations. Our idea hinges on a sound abstraction of the neural network map based on first-order Taylor expansion and bounding the remainder. To this end, we propose a method to compute analytical bounds on the network's first derivative (gradient) and second derivative (Hessian). A key aspect of our method is loop transformation on the activation functions to exploit their monotonicity effectively. The resulting end-to-end abstraction locally preserves the derivative information, yielding accurate bounds on small input sets. Finally, we employ a branch and bound framework for larger input sets to refine the abstraction recursively. We evaluate our method numerically via different examples and compare the results with relevant state-of-the-art methods.

LGJun 16, 2021
DeepSplit: Scalable Verification of Deep Neural Networks via Operator Splitting

Shaoru Chen, Eric Wong, J. Zico Kolter et al.

Analyzing the worst-case performance of deep neural networks against input perturbations amounts to solving a large-scale non-convex optimization problem, for which several past works have proposed convex relaxations as a promising alternative. However, even for reasonably-sized neural networks, these relaxations are not tractable, and so must be replaced by even weaker relaxations in practice. In this work, we propose a novel operator splitting method that can directly solve a convex relaxation of the problem to high accuracy, by splitting it into smaller sub-problems that often have analytical solutions. The method is modular, scales to very large problem instances, and compromises operations that are amenable to fast parallelization with GPU acceleration. We demonstrate our method in bounding the worst-case performance of large convolutional networks in image classification and reinforcement learning settings, and in reachability analysis of neural network dynamical systems.

OCMay 29, 2021
On Centralized and Distributed Mirror Descent: Convergence Analysis Using Quadratic Constraints

Youbang Sun, Mahyar Fazlyab, Shahin Shahrampour

Mirror descent (MD) is a powerful first-order optimization technique that subsumes several optimization algorithms including gradient descent (GD). In this work, we develop a semi-definite programming (SDP) framework to analyze the convergence rate of MD in centralized and distributed settings under both strongly convex and non-strongly convex assumptions. We view MD with a dynamical system lens and leverage quadratic constraints (QCs) to provide explicit convergence rates based on Lyapunov stability. For centralized MD under strongly convex assumption, we develop a SDP that certifies exponential convergence rates. We prove that the SDP always has a feasible solution that recovers the optimal GD rate as a special case. We complement our analysis by providing the $O(1/k)$ convergence rate for convex problems. Next, we analyze the convergence of distributed MD and characterize the rate using SDP. To the best of our knowledge, the numerical rate of distributed MD has not been previously reported in the literature. We further prove an $O(1/k)$ convergence rate for distributed MD in the convex setting. Our numerical experiments on strongly convex problems indicate that our framework certifies superior convergence rates compared to the existing rates for distributed GD.

LGMar 22, 2021
Performance Bounds for Neural Network Estimators: Applications in Fault Detection

Navid Hashemi, Mahyar Fazlyab, Justin Ruths

We exploit recent results in quantifying the robustness of neural networks to input variations to construct and tune a model-based anomaly detector, where the data-driven estimator model is provided by an autoregressive neural network. In tuning, we specifically provide upper bounds on the rate of false alarms expected under normal operation. To accomplish this, we provide a theory extension to allow for the propagation of multiple confidence ellipsoids through a neural network. The ellipsoid that bounds the output of the neural network under the input variation informs the sensitivity - and thus the threshold tuning - of the detector. We demonstrate this approach on a linear and nonlinear dynamical system.

LGDec 10, 2020
Certifying Incremental Quadratic Constraints for Neural Networks via Convex Optimization

Navid Hashemi, Justin Ruths, Mahyar Fazlyab

Abstracting neural networks with constraints they impose on their inputs and outputs can be very useful in the analysis of neural network classifiers and to derive optimization-based algorithms for certification of stability and robustness of feedback systems involving neural networks. In this paper, we propose a convex program, in the form of a Linear Matrix Inequality (LMI), to certify incremental quadratic constraints on the map of neural networks over a region of interest. These certificates can capture several useful properties such as (local) Lipschitz continuity, one-sided Lipschitz continuity, invertibility, and contraction. We illustrate the utility of our approach in two different settings. First, we develop a semidefinite program to compute guaranteed and sharp upper bounds on the local Lipschitz constant of neural networks and illustrate the results on random networks as well as networks trained on MNIST. Second, we consider a linear time-invariant system in feedback with an approximate model predictive controller parameterized by a neural network. We then turn the stability analysis into a semidefinite feasibility program and estimate an ellipsoidal invariant set for the closed-loop system.

LGNov 16, 2020
Enforcing robust control guarantees within neural network policies

Priya L. Donti, Melrose Roderick, Mahyar Fazlyab et al.

When designing controllers for safety-critical systems, practitioners often face a challenging tradeoff between robustness and performance. While robust control methods provide rigorous guarantees on system stability under certain worst-case disturbances, they often yield simple controllers that perform poorly in the average (non-worst) case. In contrast, nonlinear control methods trained using deep learning have achieved state-of-the-art performance on many control tasks, but often lack robustness guarantees. In this paper, we propose a technique that combines the strengths of these two approaches: constructing a generic nonlinear control policy class, parameterized by neural networks, that nonetheless enforces the same provable robustness criteria as robust control. Specifically, our approach entails integrating custom convex-optimization-based projection layers into a neural network-based policy. We demonstrate the power of this approach on several domains, improving in average-case performance over existing robust control methods and in worst-case stability over (non-robust) deep RL methods.

SYNov 15, 2020
Stability Analysis of Complementarity Systems with Neural Network Controllers

Alp Aydinoglu, Mahyar Fazlyab, Manfred Morari et al.

Complementarity problems, a class of mathematical optimization problems with orthogonality constraints, are widely used in many robotics tasks, such as locomotion and manipulation, due to their ability to model non-smooth phenomena (e.g., contact dynamics). In this paper, we propose a method to analyze the stability of complementarity systems with neural network controllers. First, we introduce a method to represent neural networks with rectified linear unit (ReLU) activations as the solution to a linear complementarity problem. Then, we show that systems with ReLU network controllers have an equivalent linear complementarity system (LCS) description. Using the LCS representation, we turn the stability verification problem into a linear matrix inequality (LMI) feasibility problem. We demonstrate the approach on several examples, including multi-contact problems and friction models with non-unique solutions.

OCMay 1, 2020
Robust Deep Learning as Optimal Control: Insights and Convergence Guarantees

Jacob H. Seidman, Mahyar Fazlyab, Victor M. Preciado et al.

The fragility of deep neural networks to adversarially-chosen inputs has motivated the need to revisit deep learning algorithms. Including adversarial examples during training is a popular defense mechanism against adversarial attacks. This mechanism can be formulated as a min-max optimization problem, where the adversary seeks to maximize the loss function using an iterative first-order algorithm while the learner attempts to minimize it. However, finding adversarial examples in this way causes excessive computational overhead during training. By interpreting the min-max problem as an optimal control problem, it has recently been shown that one can exploit the compositional structure of neural networks in the optimization problem to improve the training time significantly. In this paper, we provide the first convergence analysis of this adversarial training algorithm by combining techniques from robust optimal control and inexact oracle methods in optimization. Our analysis sheds light on how the hyperparameters of the algorithm affect the its stability and convergence. We support our insights with experiments on a robust classification problem.

SYApr 16, 2020
Reach-SDP: Reachability Analysis of Closed-Loop Systems with Neural Network Controllers via Semidefinite Programming

Haimin Hu, Mahyar Fazlyab, Manfred Morari et al.

There has been an increasing interest in using neural networks in closed-loop control systems to improve performance and reduce computational costs for on-line implementation. However, providing safety and stability guarantees for these systems is challenging due to the nonlinear and compositional structure of neural networks. In this paper, we propose a novel forward reachability analysis method for the safety verification of linear time-varying systems with neural networks in feedback interconnection. Our technical approach relies on abstracting the nonlinear activation functions by quadratic constraints, which leads to an outer-approximation of forward reachable sets of the closed-loop system. We show that we can compute these approximate reachable sets using semidefinite programming. We illustrate our method in a quadrotor example, in which we first approximate a nonlinear model predictive controller via a deep neural network and then apply our analysis tool to certify finite-time reachability and constraint satisfaction of the closed-loop system.

SYOct 9, 2019
Probabilistic Verification and Reachability Analysis of Neural Networks via Semidefinite Programming

Mahyar Fazlyab, Manfred Morari, George J. Pappas

Quantifying the robustness of neural networks or verifying their safety properties against input uncertainties or adversarial attacks have become an important research area in learning-enabled systems. Most results concentrate around the worst-case scenario where the input of the neural network is perturbed within a norm-bounded uncertainty set. In this paper, we consider a probabilistic setting in which the uncertainty is random with known first two moments. In this context, we discuss two relevant problems: (i) probabilistic safety verification, in which the goal is to find an upper bound on the probability of violating a safety specification; and (ii) confidence ellipsoid estimation, in which given a confidence ellipsoid for the input of the neural network, our goal is to compute a confidence ellipsoid for the output. Due to the presence of nonlinear activation functions, these two problems are very difficult to solve exactly. To simplify the analysis, our main idea is to abstract the nonlinear activation functions by a combination of affine and quadratic constraints they impose on their input-output pairs. We then show that the safety of the abstracted network, which is sufficient for the safety of the original network, can be analyzed using semidefinite programming. We illustrate the performance of our approach with numerical experiments.

LGJun 12, 2019
Efficient and Accurate Estimation of Lipschitz Constants for Deep Neural Networks

Mahyar Fazlyab, Alexander Robey, Hamed Hassani et al.

Tight estimation of the Lipschitz constant for deep neural networks (DNNs) is useful in many applications ranging from robustness certification of classifiers to stability analysis of closed-loop systems with reinforcement learning controllers. Existing methods in the literature for estimating the Lipschitz constant suffer from either lack of accuracy or poor scalability. In this paper, we present a convex optimization framework to compute guaranteed upper bounds on the Lipschitz constant of DNNs both accurately and efficiently. Our main idea is to interpret activation functions as gradients of convex potential functions. Hence, they satisfy certain properties that can be described by quadratic constraints. This particular description allows us to pose the Lipschitz constant estimation problem as a semidefinite program (SDP). The resulting SDP can be adapted to increase either the estimation accuracy (by capturing the interaction between activation functions of different layers) or scalability (by decomposition and parallel implementation). We illustrate the utility of our approach with a variety of experiments on randomly generated networks and on classifiers trained on the MNIST and Iris datasets. In particular, we experimentally demonstrate that our Lipschitz bounds are the most accurate compared to those in the literature. We also study the impact of adversarial training methods on the Lipschitz bounds of the resulting classifiers and show that our bounds can be used to efficiently provide robustness guarantees.

OCMar 4, 2019
Safety Verification and Robustness Analysis of Neural Networks via Quadratic Constraints and Semidefinite Programming

Mahyar Fazlyab, Manfred Morari, George J. Pappas

Certifying the safety or robustness of neural networks against input uncertainties and adversarial attacks is an emerging challenge in the area of safe machine learning and control. To provide such a guarantee, one must be able to bound the output of neural networks when their input changes within a bounded set. In this paper, we propose a semidefinite programming (SDP) framework to address this problem for feed-forward neural networks with general activation functions and input uncertainty sets. Our main idea is to abstract various properties of activation functions (e.g., monotonicity, bounded slope, bounded values, and repetition across layers) with the formalism of quadratic constraints. We then analyze the safety properties of the abstracted network via the S-procedure and semidefinite programming. Our framework spans the trade-off between conservatism and computational efficiency and applies to problems beyond safety verification. We evaluate the performance of our approach via numerical problem instances of various sizes.