79.3LGMay 29
Modeling Spectral Energy Shifts in Spatio-Temporal Graph Anomaly DetectionYilin Liu, Hongchao Zhang, Taylor T. Johnson et al.
Graph anomaly detection methods aim to distinguish anomalous nodes. While prior methods characterize anomalies through increased variation in the spectral energy distributions, they overlook those that result in decreased variation, i.e., camouflaged anomalies that appear normal. We show that this type of anomaly persists across multiple datasets and remains undetectable by existing spectral approaches. To address this limitation, we propose a node-level spectral energy formulation that is fully compatible with message passing and enables the detection of camouflaged anomalies. Building on this formulation, we introduce an energy-aware graph learning framework that models spectral shifts through energy-driven message passing in both static and time-series graphs. Besides, our unified architecture extends to temporal settings without introducing specialized sequence modules, enabling efficient learning under long sliding windows. Extensive experiments on large-scale benchmarks demonstrate the effectiveness and scalability of our approach.
ROMay 3, 2022Code
An Empirical Analysis of the Use of Real-Time Reachability for the Safety Assurance of Autonomous VehiclesPatrick Musau, Nathaniel Hamilton, Diego Manzanas Lopez et al.
Recent advances in machine learning technologies and sensing have paved the way for the belief that safe, accessible, and convenient autonomous vehicles may be realized in the near future. Despite tremendous advances within this context, fundamental challenges around safety and reliability are limiting their arrival and comprehensive adoption. Autonomous vehicles are often tasked with operating in dynamic and uncertain environments. As a result, they often make use of highly complex components, such as machine learning approaches, to handle the nuances of sensing, actuation, and control. While these methods are highly effective, they are notoriously difficult to assure. Moreover, within uncertain and dynamic environments, design time assurance analyses may not be sufficient to guarantee safety. Thus, it is critical to monitor the correctness of these systems at runtime. One approach for providing runtime assurance of systems with components that may not be amenable to formal analysis is the simplex architecture, where an unverified component is wrapped with a safety controller and a switching logic designed to prevent dangerous behavior. In this paper, we propose using a real-time reachability algorithm for the implementation of the simplex architecture to assure the safety of a 1/10 scale open source autonomous vehicle platform known as F1/10. The reachability algorithm that we leverage (a) provides provable guarantees of safety, and (b) is used to detect potentially unsafe scenarios. In our approach, the need to analyze an underlying controller is abstracted away, instead focusing on the effects of the controller's decisions on the system's future states. We demonstrate the efficacy of our architecture through a vast set of experiments conducted both in simulation and on an embedded hardware platform.
96.9SYJun 1
Power System CBFsAbdallah Alalem B. Albustami, Ahmad F. Taha, Taylor T. Johnson
Control barrier functions (CBFs) have become a standard tool in safety critical-control systems. CBFs convert state constraints into real time control conditions that certify forward invariance (meaning that once the system starts in a safe region, it remains there for all future times) and minimally modify a nominal controller only when safety is at risk. In power systems, CBF based methods have been proposed for frequency and voltage safety, but they largely remain disconnected from three key features that are central to power system operation: differential algebraic equation (DAE) models that capture network power flow constraints, safety specifications involving algebraic variables such as bus voltages, and formal verification of the resulting closed loop system. This paper closes this gap by developing a CBF framework for power system DAE models that supports safety constraints on both dynamic and algebraic variables. The framework provides real time safety filtering through an optimization layer that wraps around an existing controller and minimally modifies its command to enforce safety. In addition, it provides formal verification (i.e., a mathematical guarantee that all admissible trajectories satisfy the prescribed safety constraints) through an offline reachability based certificate of safe operation. The result is a unified filter and verify methodology for enforcing and certifying frequency and voltage safety in power systems while preserving the DAE structure of the underlying model.
LGJan 14, 2023
First Three Years of the International Verification of Neural Networks Competition (VNN-COMP)Christopher Brix, Mark Niklas Müller, Stanley Bak et al.
This paper presents a summary and meta-analysis of the first three iterations of the annual International Verification of Neural Networks Competition (VNN-COMP) held in 2020, 2021, and 2022. In the VNN-COMP, participants submit software tools that analyze whether given neural networks satisfy specifications describing their input-output behavior. These neural networks and specifications cover a variety of problem classes and tasks, corresponding to safety and robustness properties in image classification, neural control, reinforcement learning, and autonomous systems. We summarize the key processes, rules, and results, present trends observed over the last three years, and provide an outlook into possible future developments.
LGDec 20, 2022
The Third International Verification of Neural Networks Competition (VNN-COMP 2022): Summary and ResultsMark Niklas Müller, Christopher Brix, Stanley Bak et al. · eth-zurich
This report summarizes the 3rd International Verification of Neural Networks Competition (VNN-COMP 2022), held as a part of the 5th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), which was collocated with the 34th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2022 iteration, 11 teams participated on a diverse set of 12 scored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
SYFeb 20, 2018
Reachable Set Estimation and Safety Verification for Piecewise Linear Systems with Neural Network ControllersWeiming Xiang, Hoang-Dung Tran, Joel A. Rosenfeld et al.
In this work, the reachable set estimation and safety verification problems for a class of piecewise linear systems equipped with neural network controllers are addressed. The neural network is considered to consist of Rectified Linear Unit (ReLU) activation functions. A layer-by-layer approach is developed for the output reachable set computation of ReLU neural networks. The computation is formulated in the form of a set of manipulations for a union of polytopes. Based on the output reachable set for neural network controllers, the output reachable set for a piecewise linear feedback control system can be estimated iteratively for a given finite-time interval. With the estimated output reachable set, the safety verification for piecewise linear systems with neural network controllers can be performed by checking the existence of intersections of unsafe regions and output reach set. A numerical example is presented to illustrate the effectiveness of our approach.
SYMay 25, 2018
Reachability Analysis and Safety Verification for Neural Network Control SystemsWeiming Xiang, Taylor T. Johnson
Autonomous cyber-physical systems (CPS) rely on the correct operation of numerous components, with state-of-the-art methods relying on machine learning (ML) and artificial intelligence (AI) components in various stages of sensing and control. This paper develops methods for estimating the reachable set and verifying safety properties of dynamical systems under control of neural network-based controllers that may be implemented in embedded software. The neural network controllers we consider are feedforward neural networks called multilayer perceptrons (MLP) with general activation functions. As such feedforward networks are memoryless, they may be abstractly represented as mathematical functions, and the reachability analysis of the network amounts to range (image) estimation of this function provided a set of inputs. By discretizing the input set of the MLP into a finite number of hyper-rectangular cells, our approach develops a linear programming (LP) based algorithm for over-approximating the output set of the MLP with its input set as a union of hyper-rectangular cells. Combining the over-approximation for the output set of an MLP based controller and reachable set computation routines for ordinary difference/differential equation (ODE) models, an algorithm is developed to estimate the reachable set of the closed-loop system. Finally, safety verification for neural network control systems can be performed by checking the existence of intersections between the estimated reachable set and unsafe regions. The approach is implemented in a computational software prototype and evaluated on numerical examples.
NAMar 5, 2019
Numerical Verification of Affine Systems with up to a Billion DimensionsStanley Bak, Hoang-Dung Tran, Taylor T. Johnson
Affine systems reachability is the basis of many verification methods. With further computation, methods exist to reason about richer models with inputs, nonlinear differential equations, and hybrid dynamics. As such, the scalability of affine systems verification is a prerequisite to scalable analysis for more complex systems. In this paper, we improve the scalability of affine systems verification, in terms of the number of dimensions (variables) in the system. The reachable states of affine systems can be written in terms of the matrix exponential, and safety checking can be performed at specific time steps with linear programming. Unfortunately, for large systems with many state variables, this direct approach requires an intractable amount of memory while using an intractable amount of computation time. We overcome these challenges by combining several methods that leverage common problem structure. Memory is reduced by exploiting initial states that are not full-dimensional and safety properties (outputs) over a few linear projections of the state variables. Computation time is saved by using numerical simulations to compute only projections of the matrix exponential relevant for the verification problem. Since large systems often have sparse dynamics, we use Krylov-subspace simulation approaches based on the Arnoldi or Lanczos iterations. Our method produces accurate counter-examples when properties are violated and, in the extreme case with sufficient problem structure, can analyze a system with one billion real-valued state variables.
SYFeb 10, 2018
Reachable Set Estimation and Verification for Neural Network Models of Nonlinear Dynamic SystemsWeiming Xiang, Diego Manzanas Lopez, Patrick Musau et al.
Neural networks have been widely used to solve complex real-world problems. Due to the complicate, nonlinear, non-convex nature of neural networks, formal safety guarantees for the behaviors of neural network systems will be crucial for their applications in safety-critical systems. In this paper, the reachable set estimation and verification problems for Nonlinear Autoregressive-Moving Average (NARMA) models in the forms of neural networks are addressed. The neural network involved in the model is a class of feed-forward neural networks called Multi-Layer Perceptron (MLP). By partitioning the input set of an MLP into a finite number of cells, a layer-by-layer computation algorithm is developed for reachable set estimation for each individual cell. The union of estimated reachable sets of all cells forms an over-approximation of reachable set of the MLP. Furthermore, an iterative reachable set estimation algorithm based on reachable set estimation for MLPs is developed for NARMA models. The safety verification can be performed by checking the existence of intersections of unsafe regions and estimated reachable set. Several numerical examples are provided to illustrate our approach.
LGJul 13, 2022
Reachability Analysis of a General Class of Neural Ordinary Differential EquationsDiego Manzanas Lopez, Patrick Musau, Nathaniel Hamilton et al.
Continuous deep learning models, referred to as Neural Ordinary Differential Equations (Neural ODEs), have received considerable attention over the last several years. Despite their burgeoning impact, there is a lack of formal analysis techniques for these systems. In this paper, we consider a general class of neural ODEs with varying architectures and layers, and introduce a novel reachability framework that allows for the formal analysis of their behavior. The methods developed for the reachability analysis of neural ODEs are implemented in a new tool called NNVODE. Specifically, our work extends an existing neural network verification tool to support neural ODEs. We demonstrate the capabilities and efficacy of our methods through the analysis of a set of benchmarks that include neural ODEs used for classification, and in control and dynamical systems, including an evaluation of the efficacy and capabilities of our approach with respect to existing software tools within the continuous-time systems reachability literature, when it is possible to do so.
SYFeb 20, 2016
Order-Reduction Abstractions for Safety Verification of High-Dimensional Linear SystemsHoang-Dung Tran, Luan Viet Nguyen, Weiming Xiang et al.
Order-reduction is a standard automated approximation technique for computer-aided design, analysis, and simulation of many classes of systems, from circuits to buildings. For a given system, these methods produce a reduced-order system where the dimension of the state-space is smaller, while attempting to preserve behaviors similar to those of the full-order original system. To be used as a sound abstraction for formal verification, a measure of the similarity of behavior must be formalized and computed, which we develop in a computational way for a class of linear systems and periodically-switched systems as the main contributions of this paper. We have implemented the order-reduction as a sound abstraction process through a source-to-source model transformation in the HyST tool and use SpaceEx to compute sets of reachable states to verify properties of the full-order system through analysis of the reduced-order system. Our experimental results suggest systems with on the order of a thousand state variables can be reduced to systems with tens of state variables such that the order-reduction overapproximation error is small enough to prove or disprove safety properties of interest using current reachability analysis tools. Our results illustrate this approach is effective to alleviate the state-space explosion problem for verification of high-dimensional linear systems.
CRSep 23, 2023
SUDS: Sanitizing Universal and Dependent SteganographyPreston K. Robinette, Hanchen D. Wang, Nishan Shehadeh et al.
Steganography, or hiding messages in plain sight, is a form of information hiding that is most commonly used for covert communication. As modern steganographic mediums include images, text, audio, and video, this communication method is being increasingly used by bad actors to propagate malware, exfiltrate data, and discreetly communicate. Current protection mechanisms rely upon steganalysis, or the detection of steganography, but these approaches are dependent upon prior knowledge, such as steganographic signatures from publicly available tools and statistical knowledge about known hiding methods. These dependencies render steganalysis useless against new or unique hiding methods, which are becoming increasingly common with the application of deep learning models. To mitigate the shortcomings of steganalysis, this work focuses on a deep learning sanitization technique called SUDS that is not reliant upon knowledge of steganographic hiding techniques and is able to sanitize universal and dependent steganography. SUDS is tested using least significant bit method (LSB), dependent deep hiding (DDH), and universal deep hiding (UDH). We demonstrate the capabilities and limitations of SUDS by answering five research questions, including baseline comparisons and an ablation study. Additionally, we apply SUDS to a real-world scenario, where it is able to increase the resistance of a poisoned classifier against attacks by 1375%.
61.6LGMay 7
Reward Shaping and Action Masking for Compositional Tasks using Behavior Trees and LLMsNicholas Potteiger, Ankita Samaddar, Taylor T. Johnson et al.
Decomposing complex tasks into a sequence of simpler subtasks can improve learning efficiency for an autonomous agent. Reinforcement learning (RL) can be used to optimize agent policies to complete subtasks, but requires well-defined subtask rewards and benefits from action masking. Recent work uses large language models (LLMs) to automate reward shaping and action masking, however none of them fully address reactivity to subtask failure and modularity to varying objects for compositional tasks. To overcome these challenges, we develop masking reward behavior tree (MRBT), a symbolic structure used as a reactive and modular reward and action mask function. We design an MRBT template and derive logical specifications to construct and verify MRBTs for a sequence of object-interaction subtasks. Further, we develop an automated pipeline that uses an LLM to generate MRBTs robust to varying task objects, an SMT-solver to verify correctness of specifications, and a neurosymbolic RL loop to train agents on compositional tasks. Experiments demonstrate successful generation and refinement of five MRBTs, consistently improving training efficiency and task success rates over baselines and MRBTs without action masking. We further highlight three advantages of MRBTs: transferability, modularity, and verifiability.
SCApr 9, 2018
Simulation-Based Reachability Analysis for High-Index Large Linear Differential Algebraic EquationsHoang-Dung Tran, Weiming Xiang, Nathaniel Hamilton et al.
Reachability analysis is a fundamental problem for safety verification and falsification of Cyber-Physical Systems (CPS) whose dynamics follow physical laws usually represented as differential equations. In the last two decades, numerous reachability analysis methods and tools have been proposed for a common class of dynamics in CPS known as ordinary differential equations (ODE). However, there is lack of methods dealing with differential algebraic equations (DAE) which is a more general class of dynamics that is widely used to describe a variety of problems from engineering and science such as multibody mechanics, electrical cicuit design, incompressible fluids, molecular dynamics and chemcial process control. Reachability analysis for DAE systems is more complex than ODE systems, especially for high-index DAEs because they contain both a differential part (i.e., ODE) and algebraic constraints (AC). In this paper, we extend the recent scalable simulation-based reachability analysis in combination with decoupling techniques for a class of high-index large linear DAEs. In particular, a high-index linear DAE is first decoupled into one ODE and one or several AC subsystems based on the well-known Marz decoupling method ultilizing admissible projectors. Then, the discrete reachable set of the DAE, represented as a list of star-sets, is computed using simulation. Unlike ODE reachability analysis where the initial condition is freely defined by a user, in DAE cases, the consistency of the inititial condition is an essential requirement to guarantee a feasible solution. Therefore, a thorough check for the consistency is invoked before computing the discrete reachable set. Our approach sucessfully verifies (or falsifies) a wide range of practical, high-index linear DAE systems in which the number of state variables varies from several to thousands.
54.4LGApr 5
Towards Verified and Targeted Explanations through Formal MethodsHanchen David Wang, Diego Manzanas Lopez, Preston K. Robinette et al.
As deep neural networks are deployed in safety-critical domains such as autonomous driving and medical diagnosis, stakeholders need explanations that are interpretable but also trustworthy with formal guarantees. Existing XAI methods fall short: heuristic attribution techniques (e.g., LIME, Integrated Gradients) highlight influential features but offer no mathematical guarantees about decision boundaries, while formal methods verify robustness yet remain untargeted, analyzing the nearest boundary regardless of whether it represents a critical risk. In safety-critical systems, not all misclassifications carry equal consequences; confusing a "Stop" sign for a "60 kph" sign is far more dangerous than confusing it with a "No Passing" sign. We introduce ViTaX (Verified and Targeted Explanations), a formal XAI framework that generates targeted semifactual explanations with mathematical guarantees. For a given input (class y) and a user-specified critical alternative (class t), ViTaX: (1) identifies the minimal feature subset most sensitive to the y->t transition, and (2) applies formal reachability analysis to guarantee that perturbing these features by epsilon cannot flip the classification to t. We formalize this through Targeted epsilon-Robustness, certifying whether a feature subset remains robust under perturbation toward a specific target class. ViTaX is the first method to provide formally guaranteed explanations of a model's resilience against user-identified alternatives. Evaluations on MNIST, GTSRB, EMNIST, and TaxiNet demonstrate over 30% fidelity improvement with minimal explanation cardinality.
60.7SYMay 19
k-Inductive Neural Barrier Certificates for Unknown Nonlinear DynamicsBen Wooding, Hongchao Zhang, Taylor T. Johnson et al.
While conventional (k=1) discrete-time barrier certificate conditions impose strict safety constraints by requiring the function to be non-increasing at every step, k-inductive barrier certificates relax this by allowing a temporary increase -- up to k-1 times, each within a threshold $ε$ -- while maintaining overall safety, and improving flexibility. This paper leverages neural networks and constructs k-inductive neural barrier certificates (k-NBCs) for (partially) unknown nonlinear systems. While neural networks offer scalability in the design process, they lack formal guarantees, requiring additional approaches such as counterexample-guided inductive synthesis (CEGIS) with satisfiability modulo theories (SMT) for verification. However, the CEGIS-SMT framework requires knowledge of system dynamics, which is unavailable in practical settings. To address this, we leverage the generalization of the Willems et al.'s fundamental lemma, using a single state trajectory, to construct a data-driven representation of (partially) unknown models for SMT verification without sacrificing accuracy. Additionally, CEGIS-SMT further removes the constraint of restricting barrier certificates to specific function classes, such as sum-of-squares, enabling greater flexibility in their design. We validate our approach on three nonlinear case studies with (partially) unknown dynamics.
72.2SYMar 13
Verification and Forward Invariance of Control Barrier Functions for Differential-Algebraic SystemsHongchao Zhang, Mohamad H. Kazma, Meiyi Ma et al.
Differential-algebraic equations (DAEs) arise in power networks, chemical processes, and multibody systems, where algebraic constraints encode physical conservation laws. The safety of such systems is critical, yet safe control is challenging because algebraic constraints restrict allowable state trajectories. Control barrier functions (CBFs) provide computationally efficient safety filters for ordinary differential equation (ODE) systems. However, existing CBF methods are not directly applicable to DAEs due to potential conflicts between the CBF condition and the constraint manifold. This paper introduces DAE-aware CBFs that incorporate the differential-algebraic structure through projected vector fields. We derive conditions that ensure forward invariance of safe sets while preserving algebraic constraints and extend the framework to higher-index DAEs. A systematic verification framework is developed, establishing necessary and sufficient conditions for geometric correctness and feasibility of DAE-aware CBFs. For polynomial systems, sum-of-squares certificates are provided, while for nonpolynomial and neural network candidates, satisfiability modulo theories are used for falsification. The approach is validated on wind turbine and flexible-link manipulator systems.
LGDec 4, 2024Code
PBP: Post-training Backdoor Purification for Malware ClassifiersDung Thuy Nguyen, Ngoc N. Tran, Taylor T. Johnson et al.
In recent years, the rise of machine learning (ML) in cybersecurity has brought new challenges, including the increasing threat of backdoor poisoning attacks on ML malware classifiers. For instance, adversaries could inject malicious samples into public malware repositories, contaminating the training data and potentially misclassifying malware by the ML model. Current countermeasures predominantly focus on detecting poisoned samples by leveraging disagreements within the outputs of a diverse set of ensemble models on training data points. However, these methods are not suitable for scenarios where Machine Learning-as-a-Service (MLaaS) is used or when users aim to remove backdoors from a model after it has been trained. Addressing this scenario, we introduce PBP, a post-training defense for malware classifiers that mitigates various types of backdoor embeddings without assuming any specific backdoor embedding mechanism. Our method exploits the influence of backdoor attacks on the activation distribution of neural networks, independent of the trigger-embedding method. In the presence of a backdoor attack, the activation distribution of each layer is distorted into a mixture of distributions. By regulating the statistics of the batch normalization layers, we can guide a backdoored model to perform similarly to a clean one. Our method demonstrates substantial advantages over several state-of-the-art methods, as evidenced by experiments on two datasets, two types of backdoor methods, and various attack configurations. Notably, our approach requires only a small portion of the training data -- only 1\% -- to purify the backdoor and reduce the attack success rate from 100\% to almost 0\%, a 100-fold improvement over the baseline methods. Our code is available at \url{https://github.com/judydnguyen/pbp-backdoor-purification-official}.
LGDec 22, 2025
The 6th International Verification of Neural Networks Competition (VNN-COMP 2025): Summary and ResultsKonstantin Kaulen, Tobias Ladner, Stanley Bak et al.
This report summarizes the 6th International Verification of Neural Networks Competition (VNN-COMP 2025), held as a part of the 8th International Symposium on AI Verification (SAIV), that was collocated with the 37th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2025 iteration, 8 teams participated on a diverse set of 16 regular and 9 extended benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
CRNov 5, 2024Code
Formal Logic-guided Robust Federated Learning against Poisoning AttacksDung Thuy Nguyen, Ziyan An, Taylor T. Johnson et al.
Federated Learning (FL) offers a promising solution to the privacy concerns associated with centralized Machine Learning (ML) by enabling decentralized, collaborative learning. However, FL is vulnerable to various security threats, including poisoning attacks, where adversarial clients manipulate the training data or model updates to degrade overall model performance. Recognizing this threat, researchers have focused on developing defense mechanisms to counteract poisoning attacks in FL systems. However, existing robust FL methods predominantly focus on computer vision tasks, leaving a gap in addressing the unique challenges of FL with time series data. In this paper, we present FLORAL, a defense mechanism designed to mitigate poisoning attacks in federated learning for time-series tasks, even in scenarios with heterogeneous client data and a large number of adversarial participants. Unlike traditional model-centric defenses, FLORAL leverages logical reasoning to evaluate client trustworthiness by aligning their predictions with global time-series patterns, rather than relying solely on the similarity of client updates. Our approach extracts logical reasoning properties from clients, then hierarchically infers global properties, and uses these to verify client updates. Through formal logic verification, we assess the robustness of each client contribution, identifying deviations indicative of adversarial behavior. Experimental results on two datasets demonstrate the superior performance of our approach compared to existing baseline methods, highlighting its potential to enhance the robustness of FL to time series applications. Notably, FLORAL reduced the prediction error by 93.27% in the best-case scenario compared to the second-best baseline. Our code is available at https://anonymous.4open.science/r/FLORAL-Robust-FTS.
CVSep 15, 2025Code
Probabilistic Robustness Analysis in High Dimensional Space: Application to Semantic Segmentation NetworkNavid Hashemi, Samuel Sasaki, Diego Manzanas Lopez et al.
Semantic segmentation networks (SSNs) are central to safety-critical applications such as medical imaging and autonomous driving, where robustness under uncertainty is essential. However, existing probabilistic verification methods often fail to scale with the complexity and dimensionality of modern segmentation tasks, producing guarantees that are overly conservative and of limited practical value. We propose a probabilistic verification framework that is architecture-agnostic and scalable to high-dimensional input-output spaces. Our approach employs conformal inference (CI), enhanced by a novel technique that we call the \textbf{clipping block}, to provide provable guarantees while mitigating the excessive conservatism of prior methods. Experiments on large-scale segmentation models across CamVid, OCTA-500, Lung Segmentation, and Cityscapes demonstrate that our framework delivers reliable safety guarantees while substantially reducing conservatism compared to state-of-the-art approaches on segmentation tasks. We also provide a public GitHub repository (https://github.com/Navidhashemicodes/SSN_Reach_CLP_Surrogate) for this approach, to support reproducibility.
LGOct 30, 2024Code
PARDON: Privacy-Aware and Robust Federated Domain GeneralizationDung Thuy Nguyen, Taylor T. Johnson, Kevin Leach
Federated Learning (FL) shows promise in preserving privacy and enabling collaborative learning. However, most current solutions focus on private data collected from a single domain. A significant challenge arises when client data comes from diverse domains (i.e., domain shift), leading to poor performance on unseen domains. Existing Federated Domain Generalization approaches address this problem but assume each client holds data for an entire domain, limiting their practicality in real-world scenarios with domain-based heterogeneity and client sampling. In addition, certain methods enable information sharing among clients, raising privacy concerns as this information could be used to reconstruct sensitive private data. To overcome this, we introduce FISC, a novel FedDG paradigm designed to robustly handle more complicated domain distributions between clients while ensuring security. FISC enables learning across domains by extracting an interpolative style from local styles and employing contrastive learning. This strategy gives clients multi-domain representations and unbiased convergent targets. Empirical results on multiple datasets, including PACS, Office-Home, and IWildCam, show FISC outperforms state-of-the-art (SOTA) methods. Our method achieves accuracy on unseen domains, with improvements ranging from 3.64% to 57.22% on unseen domains. Our code is available at https://github.com/judydnguyen/PARDON-FedDG.
LGDec 28, 2023
The Fourth International Verification of Neural Networks Competition (VNN-COMP 2023): Summary and ResultsChristopher Brix, Stanley Bak, Changliu Liu et al.
This report summarizes the 4th International Verification of Neural Networks Competition (VNN-COMP 2023), held as a part of the 6th Workshop on Formal Methods for ML-Enabled Autonomous Systems (FoMLAS), that was collocated with the 35th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2023 iteration, 7 teams participated on a diverse set of 10 scored and 4 unscored benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
LGDec 28, 2024
The Fifth International Verification of Neural Networks Competition (VNN-COMP 2024): Summary and ResultsChristopher Brix, Stanley Bak, Taylor T. Johnson et al.
This report summarizes the 5th International Verification of Neural Networks Competition (VNN-COMP 2024), held as a part of the 7th International Symposium on AI Verification (SAIV), that was collocated with the 36th International Conference on Computer-Aided Verification (CAV). VNN-COMP is held annually to facilitate the fair and objective comparison of state-of-the-art neural network verification tools, encourage the standardization of tool interfaces, and bring together the neural network verification community. To this end, standardized formats for networks (ONNX) and specification (VNN-LIB) were defined, tools were evaluated on equal-cost hardware (using an automatic evaluation pipeline based on AWS instances), and tool parameters were chosen by the participants before the final test sets were made public. In the 2024 iteration, 8 teams participated on a diverse set of 12 regular and 8 extended benchmarks. This report summarizes the rules, benchmarks, participating tools, results, and lessons learned from this iteration of this competition.
PLJan 10, 2025
Neural Network Verification is a Programming Language ChallengeLucas C. Cordeiro, Matthew L. Daggitt, Julien Girard-Satabin et al.
Neural network verification is a new and rapidly developing field of research. So far, the main priority has been establishing efficient verification algorithms and tools, while proper support from the programming language perspective has been considered secondary or unimportant. Yet, there is mounting evidence that insights from the programming language community may make a difference in the future development of this domain. In this paper, we formulate neural network verification challenges as programming language challenges and suggest possible future solutions.
AIJan 15, 2024
Formal Logic Enabled Personalized Federated Learning Through Property InferenceZiyan An, Taylor T. Johnson, Meiyi Ma
Recent advancements in federated learning (FL) have greatly facilitated the development of decentralized collaborative applications, particularly in the domain of Artificial Intelligence of Things (AIoT). However, a critical aspect missing from the current research landscape is the ability to enable data-driven client models with symbolic reasoning capabilities. Specifically, the inherent heterogeneity of participating client devices poses a significant challenge, as each client exhibits unique logic reasoning properties. Failing to consider these device-specific specifications can result in critical properties being missed in the client predictions, leading to suboptimal performance. In this work, we propose a new training paradigm that leverages temporal logic reasoning to address this issue. Our approach involves enhancing the training process by incorporating mechanically generated logic expressions for each FL client. Additionally, we introduce the concept of aggregation clusters and develop a partitioning algorithm to effectively group clients based on the alignment of their temporal reasoning properties. We evaluate the proposed method on two tasks: a real-world traffic volume prediction task consisting of sensory data from fifteen states and a smart city multi-task prediction utilizing synthetic data. The evaluation results exhibit clear improvements, with performance accuracy improved by up to 54% across all sequential prediction models.
AIDec 27, 2023
Robustness Verification for Knowledge-Based Logic of Risky Driving ScenesXia Wang, Anda Liang, Jonathan Sprinkle et al.
Many decision-making scenarios in modern life benefit from the decision support of artificial intelligence algorithms, which focus on a data-driven philosophy and automated programs or systems. However, crucial decision issues related to security, fairness, and privacy should consider more human knowledge and principles to supervise such AI algorithms to reach more proper solutions and to benefit society more effectively. In this work, we extract knowledge-based logic that defines risky driving formats learned from public transportation accident datasets, which haven't been analyzed in detail to the best of our knowledge. More importantly, this knowledge is critical for recognizing traffic hazards and could supervise and improve AI models in safety-critical systems. Then we use automated verification methods to verify the robustness of such logic. More specifically, we gather 72 accident datasets from Data.gov and organize them by state. Further, we train Decision Tree and XGBoost models on each state's dataset, deriving accident judgment logic. Finally, we deploy robustness verification on these tree-based models under multiple parameter combinations.
AIMay 1, 2025
Combining LLMs with Logic-Based Framework to Explain MCTSZiyan An, Xia Wang, Hendrik Baier et al.
In response to the lack of trust in Artificial Intelligence (AI) for sequential planning, we design a Computational Tree Logic-guided large language model (LLM)-based natural language explanation framework designed for the Monte Carlo Tree Search (MCTS) algorithm. MCTS is often considered challenging to interpret due to the complexity of its search trees, but our framework is flexible enough to handle a wide range of free-form post-hoc queries and knowledge-based inquiries centered around MCTS and the Markov Decision Process (MDP) of the application domain. By transforming user queries into logic and variable statements, our framework ensures that the evidence obtained from the search tree remains factually consistent with the underlying environmental dynamics and any constraints in the actual stochastic control process. We evaluate the framework rigorously through quantitative assessments, where it demonstrates strong performance in terms of accuracy and factual consistency.
CVFeb 4, 2025
Blind Visible Watermark Removal with Morphological DilationPreston K. Robinette, Taylor T. Johnson
Visible watermarks pose significant challenges for image restoration techniques, especially when the target background is unknown. Toward this end, we present MorphoMod, a novel method for automated visible watermark removal that operates in a blind setting -- without requiring target images. Unlike existing methods, MorphoMod effectively removes opaque and transparent watermarks while preserving semantic content, making it well-suited for real-world applications. Evaluations on benchmark datasets, including the Colored Large-scale Watermark Dataset (CLWD), LOGO-series, and the newly introduced Alpha1 datasets, demonstrate that MorphoMod achieves up to a 50.8% improvement in watermark removal effectiveness compared to state-of-the-art methods. Ablation studies highlight the impact of prompts used for inpainting, pre-removal filling strategies, and inpainting model performance on watermark removal. Additionally, a case study on steganographic disorientation reveals broader applications for watermark removal in disrupting high-level hidden messages. MorphoMod offers a robust, adaptable solution for watermark removal and opens avenues for further advancements in image restoration and adversarial manipulation.
SYApr 12, 2020
NNV: The Neural Network Verification Tool for Deep Neural Networks and Learning-Enabled Cyber-Physical SystemsHoang-Dung Tran, Xiaodong Yang, Diego Manzanas Lopez et al.
This paper presents the Neural Network Verification (NNV) software tool, a set-based verification framework for deep neural networks (DNNs) and learning-enabled cyber-physical systems (CPS). The crux of NNV is a collection of reachability algorithms that make use of a variety of set representations, such as polyhedra, star sets, zonotopes, and abstract-domain representations. NNV supports both exact (sound and complete) and over-approximate (sound) reachability algorithms for verifying safety and robustness properties of feed-forward neural networks (FFNNs) with various activation functions. For learning-enabled CPS, such as closed-loop control systems incorporating neural networks, NNV provides exact and over-approximate reachability analysis schemes for linear plant models and FFNN controllers with piecewise-linear activation functions, such as ReLUs. For similar neural network control systems (NNCS) that instead have nonlinear plant models, NNV supports over-approximate analysis by combining the star set analysis used for FFNN controllers with zonotope-based analysis for nonlinear plant dynamics building on CORA. We evaluate NNV using two real-world case studies: the first is safety verification of ACAS Xu networks and the second deals with the safety verification of a deep learning-based adaptive cruise control system.
LGApr 12, 2020
Verification of Deep Convolutional Neural Networks Using ImageStarsHoang-Dung Tran, Stanley Bak, Weiming Xiang et al.
Convolutional Neural Networks (CNN) have redefined the state-of-the-art in many real-world applications, such as facial recognition, image classification, human pose estimation, and semantic segmentation. Despite their success, CNNs are vulnerable to adversarial attacks, where slight changes to their inputs may lead to sharp changes in their output in even well-trained networks. Set-based analysis methods can detect or prove the absence of bounded adversarial attacks, which can then be used to evaluate the effectiveness of neural network training methodology. Unfortunately, existing verification approaches have limited scalability in terms of the size of networks that can be analyzed. In this paper, we describe a set-based framework that successfully deals with real-world CNNs, such as VGG16 and VGG19, that have high accuracy on ImageNet. Our approach is based on a new set representation called the ImageStar, which enables efficient exact and over-approximative analysis of CNNs. ImageStars perform efficient set-based analysis by combining operations on concrete images with linear programming (LP). Our approach is implemented in a tool called NNV, and can verify the robustness of VGG networks with respect to a small set of input states, derived from adversarial attacks, such as the DeepFool attack. The experimental results show that our approach is less conservative and faster than existing zonotope methods, such as those used in DeepZ, and the polytope method used in DeepPoly.
LGDec 14, 2018
Specification-Guided Safety Verification for Feedforward Neural NetworksWeiming Xiang, Hoang-Dung Tran, Taylor T. Johnson
This paper presents a specification-guided safety verification method for feedforward neural networks with general activation functions. As such feedforward networks are memoryless, they can be abstractly represented as mathematical functions, and the reachability analysis of the neural network amounts to interval analysis problems. In the framework of interval analysis, a computationally efficient formula which can quickly compute the output interval sets of a neural network is developed. Then, a specification-guided reachability algorithm is developed. Specifically, the bisection process in the verification algorithm is completely guided by a given safety specification. Due to the employment of the safety specification, unnecessary computations are avoided and thus the computational cost can be reduced significantly. Experiments show that the proposed method enjoys much more efficiency in safety verification with significantly less computational cost.
AIOct 3, 2018
Verification for Machine Learning, Autonomy, and Neural Networks SurveyWeiming Xiang, Patrick Musau, Ayana A. Wild et al.
This survey presents an overview of verification techniques for autonomous systems, with a focus on safety-critical autonomous cyber-physical systems (CPS) and subcomponents thereof. Autonomy in CPS is enabling by recent advances in artificial intelligence (AI) and machine learning (ML) through approaches such as deep neural networks (DNNs), embedded in so-called learning enabled components (LECs) that accomplish tasks from classification to control. Recently, the formal methods and formal verification community has developed methods to characterize behaviors in these LECs with eventual goals of formally verifying specifications for LECs, and this article presents a survey of many of these recent approaches.
SYJun 24, 2018
Cyber-Physical Specification MismatchesLuan V. Nguyen, Khaza Anuarul Hoque, Stanley Bak et al.
Embedded systems use increasingly complex software and are evolving into cyber-physical systems (CPS) with sophisticated interaction and coupling between physical and computational processes. Many CPS operate in safety-critical environments and have stringent certification, reliability, and correctness requirements. These systems undergo changes throughout their lifetimes, where either the software or physical hardware is updated in subsequent design iterations. One source of failure in safety-critical CPS is when there are unstated assumptions in either the physical or cyber parts of the system, and new components do not match those assumptions. In this work, we present an automated method towards identifying unstated assumptions in CPS. Dynamic specifications in the form of candidate invariants of both the software and physical components are identified using dynamic analysis (executing and/or simulating the system implementation or model thereof). A prototype tool called Hynger (for HYbrid iNvariant GEneratoR) was developed that instruments Simulink/Stateflow (SLSF) model diagrams to generate traces in the input format compatible with the Daikon invariant inference tool, which has been extensively applied to software systems. Hynger, in conjunction with Daikon, is able to detect candidate invariants of several CPS case studies. We use the running example of a DC-to-DC power converter, and demonstrate that Hynger can detect a specification mismatch where a tolerance assumed by the software is violated due to a plant change. Another case study of an automotive control system is also introduced to illustrate the power of Hynger and Daikon in automatically identifying cyber-physical specification mismatches.
LGDec 21, 2017
Reachable Set Computation and Safety Verification for Neural Networks with ReLU ActivationsWeiming Xiang, Hoang-Dung Tran, Taylor T. Johnson
Neural networks have been widely used to solve complex real-world problems. Due to the complicate, nonlinear, non-convex nature of neural networks, formal safety guarantees for the output behaviors of neural networks will be crucial for their applications in safety-critical systems.In this paper, the output reachable set computation and safety verification problems for a class of neural networks consisting of Rectified Linear Unit (ReLU) activation functions are addressed. A layer-by-layer approach is developed to compute output reachable set. The computation is formulated in the form of a set of manipulations for a union of polyhedra, which can be efficiently applied with the aid of polyhedron computation tools. Based on the output reachable set computation results, the safety verification for a ReLU neural network can be performed by checking the intersections of unsafe regions and output reachable set described by a union of polyhedra. A numerical example of a randomly generated ReLU neural network is provided to show the effectiveness of the approach developed in this paper.
LGAug 9, 2017
Output Reachable Set Estimation and Verification for Multi-Layer Neural NetworksWeiming Xiang, Hoang-Dung Tran, Taylor T. Johnson
In this paper, the output reachable estimation and safety verification problems for multi-layer perceptron neural networks are addressed. First, a conception called maximum sensitivity in introduced and, for a class of multi-layer perceptrons whose activation functions are monotonic functions, the maximum sensitivity can be computed via solving convex optimization problems. Then, using a simulation-based method, the output reachable set estimation problem for neural networks is formulated into a chain of optimization problems. Finally, an automated safety verification is developed based on the output reachable set estimation result. An application to the safety verification for a robotic arm model with two joints is presented to show the effectiveness of proposed approaches.
ROSep 10, 2012
Safe and Stabilizing Distributed Multi-Path Cellular FlowsTaylor T. Johnson, Sayan Mitra
We study the problem of distributed traffic control in the partitioned plane, where the movement of all entities (robots, vehicles, etc.) within each partition (cell) is coupled. Establishing liveness in such systems is challenging, but such analysis will be necessary to apply such distributed traffic control algorithms in applications like coordinating robot swarms and the intelligent highway system. We present a formal model of a distributed traffic control protocol that guarantees minimum separation between entities, even as some cells fail. Once new failures cease occurring, in the case of a single target, the protocol is guaranteed to self-stabilize and the entities with feasible paths to the target cell make progress towards it. For multiple targets, failures may cause deadlocks in the system, so we identify a class of non-deadlocking failures where all entities are able to make progress to their respective targets. The algorithm relies on two general principles: temporary blocking for maintenance of safety and local geographical routing for guaranteeing progress. Our assertional proofs may serve as a template for the analysis of other distributed traffic control protocols. We present simulation results that provide estimates of throughput as a function of entity velocity, safety separation, single-target path complexity, failure-recovery rates, and multi-target path complexity.