7.9SEApr 14
Short Version of VERIFAI2026 Paper -- Learning Infused Formal Reasoning: Contract Synthesis, Artefact Reuse and Semantic FoundationsArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
Artificial intelligence systems have achieved remarkable capability in natural language processing, perception and decision-making tasks. However, their behaviour often remains opaque and difficult to verify, limiting their applicability in safety-critical systems. Formal methods provide mathematically rigorous mechanisms for specifying and verifying system behaviour, yet the creation and maintenance of formal specifications remains labour intensive and difficult to scale. This paper outlines a research vision called Learning-Infused Formal Reasoning (LIFR), which integrates machine learning techniques with formal verification workflows. The framework focuses on three complementary research directions: automated contract synthesis from natural language requirements, semantic reuse of verification artifacts using graph matching and learning-based embeddings, and mathematically grounded semantic foundations based on the Unifying Theories of Programming (UTP) and the Theory of Institutions. Together these research threads aim to transform verification from isolated correctness proofs into a cumulative knowledge-driven process where specifications, contracts and proofs can be synthesised, aligned and reused across systems.
LONov 16, 2023
Comparing Differentiable Logics for Learning Systems: A Research PreviewThomas Flinkow, Barak A. Pearlmutter, Rosemary Monahan
Extensive research on formal verification of machine learning (ML) systems indicates that learning from data alone often fails to capture underlying background knowledge. A variety of verifiers have been developed to ensure that a machine-learnt model satisfies correctness and safety properties, however, these verifiers typically assume a trained network with fixed weights. ML-enabled autonomous systems are required to not only detect incorrect predictions, but should also possess the ability to self-correct, continuously improving and adapting. A promising approach for creating ML models that inherently satisfy constraints is to encode background knowledge as logical constraints that guide the learning process via so-called differentiable logics. In this research preview, we compare and evaluate various logics from the literature in weakly-supervised contexts, presenting our findings and highlighting open problems for future work. Our experimental results are broadly consistent with results reported previously in literature; however, learning with differentiable logics introduces a new hyperparameter that is difficult to tune and has significant influence on the effectiveness of the logics.
10.1LOMay 13
Quantitative Linear Logic for Neuro-Symbolic Learning and VerificationThomas Flinkow, Ekaterina Komendantskaya, Matteo Capucci et al.
Differentiable Logics are deployed in neuro-symbolic learning tasks as a way of embedding logical constraints in the training objective of neural networks. A differentiable logic consists of a syntax to write logical properties and a semantics to interpret them as real-valued functions to be folded in the loss function. A defining trade-off of the field is that between logical properties of the connectives, and analytic concerns for the semantics, with both aspects being relevant in applications. At one extreme we find fuzzy logics, that have well-established algebraic and proof-theoretic foundations, and at the other ad-hoc differentiable logics like Fischer's DL2, conceived for deep learning applications. However, no satisfactory foundation has emerged yet. We propose a resolution to this long-standing tension via a novel logic, Quantitative Linear Logic (QLL), with foundational ambitions. Our design is driven by naturality -- the idea that, since logical constraints are translated to losses, the semantics of the connectives should be pertinent operations used in ML practice (that is, sum and log-sum-exp) on additive quantities (like logits). We then judge the result on two aspects: logical adequacy -- that they satisfy most of the standard logical laws of Linear Logic; and empirical effectiveness -- test-time performance (as measured by adversarial attacks) is well-correlated to the actual verification of the logical constraints (as measured by off-the-shelf neural network verifiers), which makes QLL stand out among SoTA techniques.
LGMay 1, 2025Code
A General Framework for Property-Driven Machine LearningThomas Flinkow, Marco Casadio, Colin Kessler et al.
Neural networks have been shown to frequently fail to learn critical safety and correctness properties purely from data, highlighting the need for training methods that directly integrate logical specifications. While adversarial training can be used to improve robustness to small perturbations within $ε$-cubes, domains other than computer vision -- such as control systems and natural language processing -- may require more flexible input region specifications via generalised hyper-rectangles. Differentiable logics offer a way to encode arbitrary logical constraints as additional loss terms that guide the learning process towards satisfying these constraints. In this paper, we investigate how these two complementary approaches can be unified within a single framework for property-driven machine learning, as a step toward effective formal verification of neural networks. We show that well-known properties from the literature are subcases of this general approach, and we demonstrate its practical effectiveness on a case study involving a neural network controller for a drone system. Our framework is made publicly available at https://github.com/tflinkow/property-driven-ml.
12.1SEApr 1
Evaluating LLM-Generated ACSL Annotations for Formal VerificationArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
Formal specifications are crucial for building verifiable and dependable software systems, yet generating accurate and verifiable specifications for real-world C programs remains challenging. This paper empirically evaluates the extent to which formal-analysis tools can automatically generate and verify ACSL specifications without human or learning-based assistance. We conduct a controlled study on a recently released dataset of 506 C programs, repurposing it from interactive, developer-driven workflows to an automated evaluation setting. Five ACSL generation systems are compared: a rule-based Python script, Frama-C's RTE plugin, and three large language models--DeepSeek-V3.2, GPT-5.2, and OLMo 3.1 32B Instruct. All generated specifications are verified under identical conditions using the Frama-C WP plugin powered by multiple SMT solvers, allowing a direct comparison of annotation quality, solver sensitivity, and proof stability. Our results provide new empirical evidence on the capabilities and limitations of automated ACSL generation, complementing prior survey-based work.
15.4SEApr 29
Graph Construction and Matching for Imperative Programs using Neural and Structural MethodsArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
Reusing verification artefacts requires identifying structural and semantic similarities across programs and their specifications. In this paper, we focus on graph construction as a foundational step toward this goal. We present a pipeline that converts imperative programs and their annotations into typed, attributed graphs. Our experiments cover datasets including C with ACSL, Java with JML, and Dafny for C\#. The pipeline integrates abstract syntax tree parsing with semantic embeddings derived from models such as SentenceTransformer and CodeBERT. This enables the generation of graph representations that capture both structural relationships and semantic context. Our results show that consistent graph representations can be constructed across different languages and annotation styles. This work provides a practical basis for future steps in semantic enrichment and approximate graph matching for scalable verification artefact reuse.
SEJun 12, 2025
Formalising Software Requirements using Large Language ModelsArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
This paper is a brief introduction to our recently initiated project named VERIFAI: Traceability and verification of natural language requirements. The project addresses the challenges in the traceability and verification of formal specifications through providing support for the automatic generation of the formal specifications and the traceability of the requirements from the initial software design stage through the systems implementation and verification. Approaches explored in this project include Natural Language Processing, use of ontologies to describe the software system domain, reuse of existing software artefacts from similar systems (i.e. through similarity based reuse) and large language models to identify and declare the specifications as well as use of artificial intelligence to guide the process.
SEFeb 2
Learning-Infused Formal Reasoning: From Contract Synthesis to Artifact Reuse and Formal SemanticsArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
This vision paper articulates a long-term research agenda for formal methods at the intersection with artificial intelligence, outlining multiple conceptual and technical dimensions and reporting on our ongoing work toward realising this agenda. It advances a forward-looking perspective on the next generation of formal methods based on the integration of automated contract synthesis, semantic artifact reuse, and refinement-based theory. We argue that future verification systems must move beyond isolated correctness proofs toward a cumulative, knowledge-driven paradigm in which specifications, contracts, and proofs are continuously synthesised and transferred across systems. To support this shift, we outline a hybrid framework combining large language models with graph-based representations to enable scalable semantic matching and principled reuse of verification artifacts. Learning-based components provide semantic guidance across heterogeneous notations and abstraction levels, while symbolic matching ensures formal soundness. Grounded in compositional reasoning, this vision points toward verification ecosystems that evolve systematically, leveraging past verification efforts to accelerate future assurance.
SEJun 17, 2025
Working Document -- Formalising Software Requirements with Large Language ModelsArshad Beg, Diarmuid O'Donoghue, Rosemary Monahan
This draft is a working document, having a summary of nighty-four (94) papers with additional sections on Traceability of Software Requirements (Section 4), Formal Methods and Its Tools (Section 5), Unifying Theories of Programming (UTP) and Theory of Institutions (Section 6). Please refer to abstract of [7,8]. Key difference of this draft from our recently anticipated ones with similar titles, i.e. AACS 2025 [7] and SAIV 2025 [8] is: [7] is a two page submission to ADAPT Annual Conference, Ireland. Submitted on 18th of March, 2025, it went through the light-weight blind review and accepted for poster presentation. Conference was held on 15th of May, 2025; [8] is a nine page paper with additional nine pages of references and summary tables, submitted to Symposium on AI Verification (SAIV 2025) on 24th of April, 2025. It went through rigorous review process. The uploaded version on arXiv.org [8] is the improved one of the submission, after addressing the specific suggestions to improve the paper.
LONov 21, 2024
Creating a Formally Verified Neural Network for Autonomous Navigation: An Experience ReportSyed Ali Asadullah Bukhari, Thomas Flinkow, Medet Inkarbekov et al.
The increased reliance of self-driving vehicles on neural networks opens up the challenge of their verification. In this paper we present an experience report, describing a case study which we undertook to explore the design and training of a neural network on a custom dataset for vision-based autonomous navigation. We are particularly interested in the use of machine learning with differentiable logics to obtain networks satisfying basic safety properties by design, guaranteeing the behaviour of the neural network after training. We motivate the choice of a suitable neural network verifier for our purposes and report our observations on the use of neural network verifiers for self-driving systems.
SEJan 12, 2022
Towards Refactoring FRETish RequirementsMarie Farrell, Matt Luckcuck, Oisin Sheridan et al.
Like software, requirements evolve and change frequently during the development process. Refactoring is the process of reorganising software without changing its behaviour, to make it easier to understand and modify. We propose refactoring for formalised requirements to reduce repetition in the requirement set so that they are easier to maintain as the system and requirements evolve. This work-in-progress paper describes our motivation for and initial approach to refactoring requirements in NASA's Formal Requirements Elicitation Tool (FRET). This work was directly triggered by our experience with an industrial aircraft engine software controller use case. In this paper, we reflect on the requirements that were obtained and, with a view to their maintainability, propose and outline functionality for refactoring FRETISH requirements.
SEDec 8, 2021
FRETting about Requirements: Formalised Requirements for an Aircraft Engine ControllerMarie Farrell, Matt Luckcuck, Oisin Sheridan et al.
[Context & motivation] Eliciting requirements that are detailed and logical enough to be amenable to formal verification is a difficult task. Multiple tools exist for requirements elicitation and some of these also support formalisation of requirements in a way that is useful for formal methods. [Question/problem] This paper reports on our experience of using the FRET alongside our industrial partner. The use case that we investigate is an aircraft engine controller. In this context, we evaluate the use of FRET to bridge the communication gap between formal methods experts and aerospace industry specialists. [Principal ideas/results] We describe our journey from ambiguous, natural-language requirements to concise, formalised FRET requirements. We include our analysis of the formalised requirements from the perspective of patterns, translation into other formal methods and the relationship between parent-child requirements in this set. We also provide insight into lessons learned throughout this process and identify future improvements to FRET. [Contribution] Previous experience reports have been published by the FRET team, but this is the first such report of an industrial use case that was written by researchers that have not been involved FRET's development.
SEOct 18, 2021
A Methodology for Developing a Verifiable Aircraft Engine Controller from Formal RequirementsMatt Luckcuck, Marie Farrell, Oisín Sheridan et al.
Verification of complex, safety-critical systems is a significant challenge. Manual testing and simulations are often used, but are only capable of exploring a subset of the system's reachable states. Formal methods are mathematically-based techniques for the specification and development of software, which can provide proofs of properties and exhaustive checks over a system's state space. In this paper, we present a formal requirements-driven methodology, applied to a model of an aircraft engine controller that has been provided by our industrial partner. Our methodology begins by formalising the controller's natural-language requirements using the (pre-existing) Formal Requirements Elicitation Tool (FRET), iteratively, in consultation with our industry partner. Once formalised, FRET can automatically translate the requirements to enable their verification alongside a Simulink model of the aircraft engine controller; the requirements can also guide formal verification using other approaches. These two parallel streams in our methodology seek to combine the results from formal requirements elicitation, classical verification approaches, and runtime verification; to support the verification of aerospace systems modelled in Simulink, from the requirements phase through to execution. Our methodology harnesses the power of formal methods in a way that complements existing verification techniques, and supports the traceability of requirements throughout the verification process. This methodology streamlines the process of developing verifiable aircraft engine controllers, by ensuring that the requirements are formalised up-front and useable during development. In this paper we give an overview of (FRET), describe our methodology and work to-date on the formalisation and verification of the requirements, and outline future work using our methodology.
SEDec 20, 2019
Proceedings Fifth Workshop on Formal Integrated Development EnvironmentRosemary Monahan, Virgile Prevosto, Jose Proença
This volume contains the proceedings of F-IDE 2019, the fifth international workshop on Formal Integrated Development Environment, which was held on October 7, 2019 in Porto, Portugal, as part of FM'19, the 3rd World Congress on Formal Methods. High levels of safety, security and privacy standards require the use of formal methods to specify and develop compliant software (sub)systems. Any standard comes with an assessment process, which requires a complete documentation of the application in order to ease the justification of design choices and the review of code and proofs. Thus tools are needed for handling specifications, program constructs and verification artifacts. The aim of the F-IDE workshop is to provide a forum for presenting and discussing research efforts as well as experience returns on design, development and usage of formal IDE aiming at making formal methods "easier" for both specialists and non-specialists.
PLNov 22, 2018
Proceedings 4th Workshop on Formal Integrated Development EnvironmentPaolo Masci, Rosemary Monahan, Virgile Prevosto
This volume contains the proceedings of F-IDE 2018, the fourth international workshop on Formal Integrated Development Environment, which was held as a FLoC 2018 satellite event, on July 14, 2018, in Oxford, England. High levels of safety, security and also privacy standards require the use of formal methods to specify and develop compliant software (sub)systems. Any standard comes with an assessment process, which requires a complete documentation of the application in order to ease the justification of design choices and the review of code and proofs. Thus tools are needed for handling specifications, program constructs and verification artifacts. The aim of the F-IDE workshop is to provide a forum for presenting and discussing research efforts as well as experience returns on design, development and usage of formal IDE aiming at making formal methods "easier" for both specialists and non-specialists.
SEJan 30, 2017
Predicting SMT Solver Performance for Software VerificationAndrew Healy, Rosemary Monahan, James F. Power
The Why3 IDE and verification system facilitates the use of a wide range of Satisfiability Modulo Theories (SMT) solvers through a driver-based architecture. We present Where4: a portfolio-based approach to discharge Why3 proof obligations. We use data analysis and machine learning techniques on static metrics derived from program source code. Our approach benefits software engineers by providing a single utility to delegate proof obligations to the solvers most likely to return a useful result. It does this in a time-efficient way using existing Why3 and solver installations - without requiring low-level knowledge about SMT solver operation from the user.