16.2SEMay 25Code
ESBMC: A Survey of Its Evolution, Integration, and Future Directions in Formal Software VerificationPierre Dantas, Lucas Cordeiro, Waldir Junior
The Efficient SMT-Based Context-Bounded Model Checker (ESBMC) has grown from a research prototype for verifying embedded ANSI-C software into one of the most versatile and industrially capable formal verification platforms available today. Since its first publication in 2009, ESBMC has undergone persistent evolution: expanding its verification techniques, widening its language support to nine front-ends, integrating industrial-strength SMT solvers, and - most recently - coupling with Large Language Models (LLMs) and autonomous AI agents. This survey traces the full trajectory of ESBMC from its original design principles to the state of the art in 2025-2026, documenting 43 awards at SV-COMP and Test-Comp, its role as a formal verification backend for LLM-driven self-healing software and loop invariant generation, and the first industrial deployment of an integrated agentic model-checking architecture through the NVIDIA-OpenSMA framework, establishing ESBMC as a natively autonomous verification kernel rather than a passive validation backend. We synthesize its economic impact - over GBP 9.3 million and EUR 4.98 million in confirmed public research funding, the VeriBee spin-off, and a defense industrial deployment at Lockheed Martin - and conclude with a structured agenda of open challenges spanning scalability, neurosymbolic verification, counterexample intelligibility, cross-language verification, safety standards compliance, and open-source sustainability.
SYMay 6, 2017
Automated Formal Synthesis of Digital Controllers for State-Space Physical PlantsAlessandro Abate, Iury Bessa, Dario Cattaruzza et al.
We present a sound and automated approach to synthesize safe digital feedback controllers for physical plants represented as linear, time invariant models. Models are given as dynamical equations with inputs, evolving over a continuous state space and accounting for errors due to the digitalization of signals by the controller. Our approach has two stages, leveraging counterexample guided inductive synthesis (CEGIS) and reachability analysis. CEGIS synthesizes a static feedback controller that stabilizes the system under restrictions given by the safety of the reach space. Safety is verified either via BMC or abstract acceleration; if the verification step fails, we refine the controller by generalizing the counterexample. We synthesize stable and safe controllers for intricate physical plant models from the digital control literature.
SYFeb 16, 2017
Sound and Automated Synthesis of Digital Stabilizing Controllers for Continuous PlantsAlessandro Abate, Iury Bessa, Dario Cattaruzza et al.
Modern control is implemented with digital microcontrollers, embedded within a dynamical plant that represents physical components. We present a new algorithm based on counter-example guided inductive synthesis that automates the design of digital controllers that are correct by construction. The synthesis result is sound with respect to the complete range of approximations, including time discretization, quantization effects, and finite-precision arithmetic and its rounding errors. We have implemented our new algorithm in a tool called DSSynth, and are able to automatically generate stable controllers for a set of intricate plant models taken from the literature within minutes.
SYFeb 18, 2017
Verifying Digital Systems with MATLABLennon Chaves, Iury Bessa, Lucas Cordeiro et al.
A MATLAB toolbox is presented, with the goal of checking occurrences of design errors typically found in fixed-point digital systems, considering finite word-length effects. In particular, the present toolbox works as a front-end to a recently introduced verification tool, known as Digital-System Verifier, and checks overflow, limit cycle, quantization, stability, and minimum phase errors, in digital systems represented by transfer-function and state-space equations. It provides a command-line version, with simplified access to specific functions, and a graphical-user interface, which was developed as a MATLAB application. The resulting toolbox is important for the verification community, since it shows the applicability of verification to real-world systems.
NENov 24, 2022
AIREPAIR: A Repair Platform for Neural NetworksXidan Song, Youcheng Sun, Mustafa A. Mustafa et al.
We present AIREPAIR, a platform for repairing neural networks. It features the integration of existing network repair tools. Based on AIREPAIR, one can run different repair methods on the same model, thus enabling the fair comparison of different repair techniques. We evaluate AIREPAIR with three state-of-the-art repair tools on popular deep-learning datasets and models. Our evaluation confirms the utility of AIREPAIR, by comparing and analyzing the results from different repair techniques. A demonstration is available at https://youtu.be/UkKw5neeWhw.
CLApr 26, 2022
Systematicity, Compositionality and Transitivity of Deep NLP Models: a Metamorphic Testing PerspectiveEdoardo Manino, Julia Rozanova, Danilo Carvalho et al.
Metamorphic testing has recently been used to check the safety of neural NLP models. Its main advantage is that it does not rely on a ground truth to generate test cases. However, existing studies are mostly concerned with robustness-like metamorphic relations, limiting the scope of linguistic properties they can test. We propose three new classes of metamorphic relations, which address the properties of systematicity, compositionality and transitivity. Unlike robustness, our relations are defined over multiple source inputs, thus increasing the number of test cases that we can produce by a polynomial factor. With them, we test the internal consistency of state-of-the-art NLP models, and show that they do not always behave according to their expected linguistic properties. Lastly, we introduce a novel graphical notation that efficiently summarises the inner structure of metamorphic relations.
CLApr 20, 2023
Interventional Probing in High Dimensions: An NLI Case StudyJulia Rozanova, Marco Valentino, Lucas Cordeiro et al.
Probing strategies have been shown to detect the presence of various linguistic features in large language models; in particular, semantic features intermediate to the "natural logic" fragment of the Natural Language Inference task (NLI). In the case of natural logic, the relation between the intermediate features and the entailment label is explicitly known: as such, this provides a ripe setting for interventional studies on the NLI models' representations, allowing for stronger causal conjectures and a deeper critical analysis of interventional probing methods. In this work, we carry out new and existing representation-level interventions to investigate the effect of these semantic features on NLI classification: we perform amnesic probing (which removes features as directed by learned linear probes) and introduce the mnestic probing variation (which forgets all dimensions except the probe-selected ones). Furthermore, we delve into the limitations of these methods and outline some pitfalls have been obscuring the effectivity of interventional probing studies.
LGOct 21, 2022
Towards Global Neural Network Abstractions with Locally-Exact ReconstructionEdoardo Manino, Iury Bessa, Lucas Cordeiro
Neural networks are a powerful class of non-linear functions. However, their black-box nature makes it difficult to explain their behaviour and certify their safety. Abstraction techniques address this challenge by transforming the neural network into a simpler, over-approximated function. Unfortunately, existing abstraction techniques are slack, which limits their applicability to small local regions of the input domain. In this paper, we propose Global Interval Neural Network Abstractions with Center-Exact Reconstruction (GINNACER). Our novel abstraction technique produces sound over-approximation bounds over the whole input domain while guaranteeing exact reconstructions for any given local input. Our experiments show that GINNACER is several orders of magnitude tighter than state-of-the-art global abstraction techniques, while being competitive with local ones.
LOFeb 28, 2017
Automated Verification and Synthesis of Embedded Systems using Machine LearningLucas Cordeiro
The dependency on the correct functioning of embedded systems is rapidly growing, mainly due to their wide range of applications, such as micro-grids, automotive device control, health care, surveillance, mobile devices, and consumer electronics. Their structures are becoming more and more complex and now require multi-core processors with scalable shared memory, in order to meet increasing computational power demands. As a consequence, reliability of embedded (distributed) software becomes a key issue during system development, which must be carefully addressed and assured. The present research discusses challenges, problems, and recent advances to ensure correctness and timeliness regarding embedded systems. Reliability issues, in the development of micro-grids and cyber-physical systems, are then considered, as a prominent verification and synthesis application. In particular, machine learning techniques emerge as one of the main approaches to learn reliable implementations of embedded software for achieving a correct-by-construction design.
CLOct 10, 2022
Montague semantics and modifier consistency measurement in neural language modelsDanilo S. Carvalho, Edoardo Manino, Julia Rozanova et al.
This work proposes a novel methodology for measuring compositional behavior in contemporary language embedding models. Specifically, we focus on adjectival modifier phenomena in adjective-noun phrases. In recent years, distributional language representation models have demonstrated great practical success. At the same time, the need for interpretability has elicited questions on their intrinsic properties and capabilities. Crucially, distributional models are often inconsistent when dealing with compositional phenomena in natural language, which has significant implications for their safety and fairness. Despite this, most current research on compositionality is directed towards improving their performance on similarity tasks only. This work takes a different approach, introducing three novel tests of compositional behavior inspired by Montague semantics. Our experimental results indicate that current neural language models do not behave according to the expected linguistic theories. This indicates that current language models may lack the capability to capture the semantic properties we evaluated on limited context, or that linguistic theories from Montagovian tradition may not match the expected capabilities of distributional models.
AINov 25, 2021Code
QNNVerifier: A Tool for Verifying Neural Networks using SMT-Based Model CheckingXidan Song, Edoardo Manino, Luiz Sena et al.
QNNVerifier is the first open-source tool for verifying implementations of neural networks that takes into account the finite word-length (i.e. quantization) of their operands. The novel support for quantization is achieved by employing state-of-the-art software model checking (SMC) techniques. It translates the implementation of neural networks to a decidable fragment of first-order logic based on satisfiability modulo theories (SMT). The effects of fixed- and floating-point operations are represented through direct implementations given a hardware-determined precision. Furthermore, QNNVerifier allows to specify bespoke safety properties and verify the resulting model with different verification strategies (incremental and k-induction) and SMT solvers. Finally, QNNVerifier is the first tool that combines invariant inference via interval analysis and discretization of non-linear activation functions to speed up the verification of neural networks by orders of magnitude. A video presentation of QNNVerifier is available at https://youtu.be/7jMgOL41zTY
GTJan 5, 2022
Privacy-Friendly Peer-to-Peer Energy Trading: A Game Theoretical ApproachKamil Erdayandi, Amrit Paudel, Lucas Cordeiro et al.
In this paper, we propose a decentralized, privacy-friendly energy trading platform (PFET) based on game theoretical approach - specifically Stackelberg competition. Unlike existing trading schemes, PFET provides a competitive market in which prices and demands are determined based on competition, and computations are performed in a decentralized manner which does not rely on trusted third parties. It uses homomorphic encryption cryptosystem to encrypt sensitive information of buyers and sellers such as sellers$'$ prices and buyers$'$ demands. Buyers calculate total demand on particular seller using an encrypted data and sensitive buyer profile data is hidden from sellers. Hence, privacy of both sellers and buyers is preserved. Through privacy analysis and performance evaluation, we show that PFET preserves users$'$ privacy in an efficient manner.
LGJun 10, 2021
Verifying Quantized Neural Networks using SMT-Based Model CheckingLuiz Sena, Xidan Song, Erickson Alves et al.
Artificial Neural Networks (ANNs) are being deployed for an increasing number of safety-critical applications, including autonomous cars and medical diagnosis. However, concerns about their reliability have been raised due to their black-box nature and apparent fragility to adversarial attacks. These concerns are amplified when ANNs are deployed on restricted system, which limit the precision of mathematical operations and thus introduce additional quantization errors. Here, we develop and evaluate a novel symbolic verification framework using software model checking (SMC) and satisfiability modulo theories (SMT) to check for vulnerabilities in ANNs. More specifically, we propose several ANN-related optimizations for SMC, including invariant inference via interval analysis, slicing, expression simplifications, and discretization of non-linear activation functions. With this verification framework, we can provide formal guarantees on the safe behavior of ANNs implemented both in floating- and fixed-point arithmetic. In this regard, our verification approach was able to verify and produce adversarial examples for $52$ test cases spanning image classification and general machine learning applications. Furthermore, for small- to medium-sized ANN, our approach completes most of its verification runs in minutes. Moreover, in contrast to most state-of-the-art methods, our approach is not restricted to specific choices regarding activation functions and non-quantized representations. Our experiments show that our approach can analyze larger ANN implementations and substantially reduce the verification time compared to state-of-the-art techniques that use SMT solving.
CRFeb 4, 2021
Verifying Security Vulnerabilities in Large Software Systems using Multi-Core k-InductionThales Silva, Carmina Porto, Erickson Alves et al.
Computer-based systems have been used to solve several domain problems, such as industrial, military, education, and wearable. Those systems need high-quality software to guarantee security and safety. We advocate that Bounded Model Checking (BMC) techniques can detect security vulnerabilities in the early stages of development processes. However, this technique struggles to scale up and verify large software commonly found on computer-based systems. Here, we develop and evaluate a pragmatic approach to verify large software systems using a state-of-the-art bounded model checker. In particular, we pre-process the input source-code files and then guide the model checker to explore the code systematically. We also present a multi-core implementation of the k-induction proof algorithm to verify and falsify large software systems iteratively. Our experimental results using the Efficient SMT-based Model Checker (ESBMC) show that our approach can guide ESBMC to efficiently verify large software systems. We evaluate our approach using the PuTTY application to verify 136 files and 2803 functions in less than 86 minutes, and the SlimGuard allocator, where we have found real security vulnerabilities confirmed by the developers. We conclude that our approach can successfully guide a bounded model checker to verify large software systems systematically.
SEDec 21, 2020
Incremental Symbolic Bounded Model Checking of Software Using Interval Methods via ContractorsMohannad Aldughaim, Kaled Alshmrany, Rafael Menezes et al.
Bounded model checking (BMC) is vital for finding program property violations. For unsafe programs, BMC can quickly find an execution path from an initial state to the violated state that refutes a given safety property. However, BMC techniques struggle to falsify programs that contain loops. BMC needs to incrementally unfold the program loops up to the bound $k$, exposing the property violation, which can thus lead to exploring a considerable state space. Here, we describe and evaluate the first verification method based on interval methods via contractors to reduce the domains of variables representing the search space. This reduction is based on the specified property modeled as functions representing the contractor constraints. In particular, we exploit interval methods via contractors to incrementally analyze the program loop variables and contract the domain where the property is guaranteed to hold to prune the search exploration, thus reducing resource consumption aggressively. Experimental results demonstrate the efficiency and efficacy of our proposed approach over a large set of benchmarks, including $7044$ verification tasks, compared with state-of-the-art BMC tools. Our proposed method can reduce memory usage up to $75$\% while verifying $1$\% more verification tasks.
LODec 21, 2020
Incremental Verification of Fixed-Point Implementations of Neural NetworksLuiz Sena, Erickson Alves, Iury Bessa et al.
Implementations of artificial neural networks (ANNs) might lead to failures, which are hardly predicted in the design phase since ANNs are highly parallel and their parameters are barely interpretable. Here, we develop and evaluate a novel symbolic verification framework using incremental bounded model checking (BMC), satisfiability modulo theories (SMT), and invariant inference, to obtain adversarial cases and validate coverage methods in a multi-layer perceptron (MLP). We exploit incremental BMC based on interval analysis to compute boundaries from a neuron's input. Then, the latter are propagated to effectively find a neuron's output since it is the input of the next one. This paper describes the first bit-precise symbolic verification framework to reason over actual implementations of ANNs in CUDA, based on invariant inference, therefore providing further guarantees about finite-precision arithmetic and its rounding errors, which are routinely ignored in the existing literature. We have implemented the proposed approach on top of the efficient SMT-based bounded model checker (ESBMC), and its experimental results show that it can successfully verify safety properties, in actual implementations of ANNs, and generate real adversarial cases in MLPs. Our approach was able to verify and produce adversarial examples for 85.8% of 21 test cases considering different input images, and 100% of the properties related to covering methods. Although our verification time is higher than existing approaches, our methodology can consider fixed-point implementation aspects that are disregarded by the state-of-the-art verification methodologies.
CRJan 27, 2020
Finding Security Vulnerabilities in Network Protocol ImplementationsKaled Alshmrany, Lucas Cordeiro
Implementations of network protocols are often prone to vulnerabilities caused by developers' mistakes when accessing memory regions and dealing with arithmetic operations. Finding practical approaches for checking the security of network protocol implementations has proven to be a challenging problem. The main reason is that the protocol software state-space is too large to be explored. Here we propose a novel verification approach that combines fuzzing with symbolic execution to verify intricate properties in network protocol implementations. We use fuzzing for an initial exploration of the network protocol, while symbolic execution explores both the program paths and protocol states, which were uncovered by fuzzing. From this combination, we automatically generate high-coverage test input packets for a network protocol implementation. We surveyed various approaches based on fuzzing and symbolic execution to understand how these techniques can be effectively combined and then choose a suitable tool to develop further our model on top of it. In our preliminary evaluation, we used ESBMC, Map2Check, and KLEE as software verifiers and SPIKE as fuzzer to check their suitability to verify our network protocol implementations. Our experimental results show that ESBMC can be further developed within our verification framework called \textit{FuSeBMC}, to efficiently and effectively detect intricate security vulnerabilities in network protocol implementations.
LOSep 11, 2018
Benchmarking of Java Verification Tools at the Software Verification Competition (SV-COMP)Lucas Cordeiro, Daniel Kroening, Peter Schrammel
Empirical evaluation of verification tools by benchmarking is a common method in software verification research. The Competition on Software Verification (SV-COMP) aims at standardization and reproducibility of benchmarking within the software verification community on an annual basis, through comparative evaluation of fully automatic software verifiers for C programs. Building upon this success, here we describe how to re-use the ecosystem developed around SV-COMP for benchmarking Java verification tools. We provide a detailed description of the rules for benchmark verification tasks, the integration of new tools into SV-COMP's benchmarking framework and also give experimental results of a benchmarking run on state-of-the-art Java verification tools, JPF, SPF, JayHorn and JBMC.
SYJun 9, 2017
DSValidator: An Automated Counterexample Reproducibility Tool for Digital Systems (Tool Demonstration)Lennon Chaves, Iury Bessa, Lucas Cordeiro et al.
An automated counterexample reproducibility tool based on MATLAB is presented, called DSValidator, with the goal of reproducing counterexamples that refute specific properties related to digital systems. We exploit counterexamples generated by the Digital System Verifier (DSVerifier), which is a model checking tool based on satisfiability modulo theories for digital systems. DSValidator reproduces the execution of a digital system, relating its input with the counterexample, in order to establish trust in a verification result. We show that DSValidator can validate a set of intricate counterexamples for digital controllers used in a real quadrotor attitude system within seconds and also expose incorrect verification results in DSVerifier. The resulting toolbox leverages the potential of combining different verification tools for validating digital systems via an exchangeable counterexample format.
LOSep 8, 2015
Applying Multi-Core Model Checking to Hardware-Software Partitioning in Embedded Systems (extended version)Alessandro Trindade, Hussama Ismail, Lucas Cordeiro
We present an alternative approach to solve the hardware (HW) and software (SW) partitioning problem, which uses Bounded Model Checking (BMC) based on Satisfiability Modulo Theories (SMT) in conjunction with a multi-core support using Open Multi-Processing. The multi-core SMT-based BMC approach allows initializing many verification instances based on processors cores numbers available to the model checker. Each instance checks for a different optimum value until the optimization problem is satisfied. The goal is to show that multi-core model-checking techniques can be effective, in particular cases, to find the optimal solution of the HW-SW partitioning problem using an SMT-based BMC approach. We compare the experimental results of our proposed approach with Integer Linear Programming and the Genetic Algorithm.
LOSep 8, 2015
Model Checking Embedded C Software using k-Induction and Invariants (extended version)Herbert Rocha, Hussama Ismail, Lucas Cordeiro et al.
We present a proof by induction algorithm, which combines k-induction with invariants to model check embedded C software with bounded and unbounded loops. The k-induction algorithm consists of three cases: in the base case, we aim to find a counterexample with up to k loop unwindings; in the forward condition, we check whether loops have been fully unrolled and that the safety property P holds in all states reachable within k unwindings; and in the inductive step, we check that whenever P holds for k unwindings, it also holds after the next unwinding of the system. For each step of the k-induction algorithm, we infer invariants using affine constraints (i.e., polyhedral) to specify pre- and post-conditions. Experimental results show that our approach can handle a wide variety of safety properties in typical embedded software applications from telecommunications, control systems, and medical devices; we demonstrate an improvement of the induction algorithm effectiveness if compared to other approaches.
LOFeb 9, 2015
Model Checking C Programs with Loops via k-Induction and InvariantsHerbert Rocha, Hussama Ismail, Lucas Cordeiro et al.
We present a novel proof by induction algorithm, which combines k-induction with invariants to model check C programs with bounded and unbounded loops. The k-induction algorithm consists of three cases: in the base case, we aim to find a counterexample with up to k loop unwindings; in the forward condition, we check whether loops have been fully unrolled and that the safety property P holds in all states reachable within k unwindings; and in the inductive step, we check that whenever P holds for k unwindings, it also holds after the next unwinding of the system. For each step of the k-induction algorithm, we infer invariants using affine constraints (i.e., polyhedral) to specify pre- and post-conditions. The algorithm was implemented in two different ways, with and without invariants using polyhedral, and the results were compared. Experimental results show that both forms can handle a wide variety of safety properties; however, the k-induction algorithm adopting polyhedral solves more verification tasks, which demonstrate an improvement of the induction algorithm effectiveness.
SYMar 20, 2014
SMT-Based Bounded Model Checking of Fixed-Point Digital ControllersIury Bessa, Renato Abreu, João Edgar Filho et al.
Digital controllers have several advantages with respect to their flexibility and design's simplicity. However, they are subject to problems that are not faced by analog controllers. In particular, these problems are related to the finite word-length implementation that might lead to overflows, limit cycles, and time constraints in fixed-point processors. This paper proposes a new method to detect design's errors in digital controllers using a state-of-the art bounded model checker based on satisfiability modulo theories. The experiments with digital controllers for a ball and beam plant demonstrate that the proposed method can be very effective in finding errors in digital controllers than other existing approaches based on traditional simulations tools.
SEMay 13, 2013
Verifying Fixed-Point Digital Filters using SMT-Based Bounded Model CheckingRenato B. Abreu, Lucas Cordeiro, Eddie B. L. Filho
The implementation of digital filters in processors based on fixed-point arithmetic can lead to problems related to the finite word-length. In particular, the processing of signals in such filters can produce overflows and unwanted noise caused by quantization and round off effect during the accumulative addition and multiplication operations. In this paper, we describe a new approach to verify digital filters using an off-the-shelf bounded model checker called ESBMC, which supports full C/C++ and is based on satisfiability modulo theories solvers. In particular, we are able to verify the occurrence of overflows, limit cycles, and time constraints based on a discrete-time model implemented in C. The experiments show that the proposed approach can be used to verify potential problems in fixed-point implementation of digital filters and it can thus be effective in finding realistic design errors.