SE Jul 21, 2025
Do AI models help produce verified bug fixes?
Li Huang, Ilgiz Mustafin, Marco Piccioni et al.
Among areas of software engineering where AI techniques -- particularly, Large Language Models -- seem poised to yield dramatic improvements, an attractive candidate is Automatic Program Repair (APR), the production of satisfactory corrections to software bugs. Does this expectation materialize in practice? How do we find out, making sure that proposed corrections actually work? If programmers have access to LLMs, how do they actually use them to complement their own skills? To answer these questions, we took advantage of the availability of a program-proving environment, which formally determines the correctness of proposed fixes, to conduct a study of program debugging with two randomly assigned groups of programmers, one with access to LLMs and the other without, both validating their answers through the proof tools. The methodology relied on a division into general research questions (Goals in the Goal-Query-Metric approach), specific elements admitting specific answers (Queries), and measurements supporting these answers (Metrics). While applied so far to a limited sample size, the results are a first step towards delineating a proper role for AI and LLMs in providing guaranteed-correct fixes to program bugs. These results caused surprise as compared to what one might expect from the use of AI for debugging and APR. The contributions also include: a detailed methodology for experiments in the use of LLMs for debugging, which other projects can reuse; a fine-grain analysis of programmer behavior, made possible by the use of full-session recording; a definition of patterns of use of LLMs, with 7 distinct categories; and validated advice for getting the best of LLMs for debugging and Automatic Program Repair.
SE Nov 28, 2025
AI for software engineering: from probable to provable
Bertrand Meyer
Vibe coding, the much-touted use of AI techniques for programming, faces two overwhelming obstacles: the difficulty of specifying goals ("prompt engineering" is a form of requirements engineering, one of the toughest disciplines of software engineering); and the hallucination phenomenon. Programs are only useful if they are correct or very close to correct. The solution? Combine the creativity of artificial intelligence with the rigor of formal specification methods and the power of formal program verification, supported by modern proof tools.
PL Sep 14, 2021
The concept of class invariant in object-oriented programming
Bertrand Meyer, Alisa Arkadova, Alexander Kogtenkov
Class invariants -- consistency constraints preserved by every operation on objects of a given type -- are fundamental to building, understanding and verifying object-oriented programs. For verification, however, they raise difficulties, which have not yet received a generally accepted solution. The present work introduces a proof rule meant to address these issues and allow verification tools to benefit from invariants. It clarifies the notion of invariant and identifies the three associated problems: callbacks, furtive access and reference leak. As an example, the 2016 Ethereum DAO bug, in which $50 million were stolen, resulted from a callback invalidating an invariant. The discussion starts with a simplified model of computation and an associated proof rule, demonstrating its soundness. It then removes one by one the three simplifying assumptions, each removal raising one of the three issues, and leading to a corresponding adaptation to the proof rule. The final version of the rule can tackle tricky examples, including "challenge problems" listed in the literature.
SE Nov 6, 2019
The role of formalism in system requirements (full version)
Jean-Michel Bruel, Sophie Ebersold, Florian Galinier et al.
A major determinant of the quality of software systems is the quality of their requirements, which should be both understandable and precise. Most requirements are written in natural language, good for understandability but lacking in precision. To make requirements precise, researchers have for years advocated the use of mathematics-based notations and methods, known as "formal". Many exist, differing in their style, scope and applicability. The present survey discusses some of the main formal approaches and compares them to informal methods. The analysis uses a set of 9 complementary criteria, such as level of abstraction, tool availability, traceability support. It classifies the approaches into five categories: general-purpose, natural-language, graph/automata, other mathematical notations, seamless (programming-language-based). It presents approaches in all of these categories, altogether 22 different ones, including for example SysML, Relax, Eiffel, Event-B, Alloy. The review discusses a number of open questions, including seamlessness, the role of tools and education, and how to make industrial applications benefit more from the contributions of formal approaches. (This is the full version of the survey, including some sections and two appendices which, because of length restrictions, do not appear in the submitted version.)
SE Jun 15, 2019
The Anatomy of Requirements
Bertrand Meyer, Jean-Michel Bruel, Sophie Ebersold et al.
Requirements engineering is crucial to software development but lacks a precise definition of its fundamental concepts. Even the basic definitions in the literature and in industry standards are often vague and verbose. To remedy this situation and provide a solid basis for discussions of requirements, this work provides precise definitions of the fundamental requirements concepts and two systematic classifications: a taxonomy of requirement elements (such as components, goals, constraints...); and a taxonomy of possible relations between these elements (such as "extends", "excepts", "belongs"...). The discussion evaluates the taxonomies on published requirements documents; readers can test the concepts in two online quizzes. The intended result of this work is to spur new advances in the study and practice of software requirements by clarifying the fundamental concepts.
SE Aug 27, 2018
AutoFrame: Automatic Frame Inference for Object-Oriented Languages
Victor Rivera, Bertrand Meyer
Automatic program verification has made tremendous strides, but is not yet for the masses. How do we make it less painful? This article addresses one of the obstacles: the need to specify explicit "frame clauses", expressing what properties are left unchanged by an operation. It is fair enough to ask the would-be (human) prover to state what each operation changes, and how, but the (mechanical) prover also requires knowledge of what it does not change. The process of specifying and verifying these properties is tedious and error-prone, and must be repeated whenever the software evolves. It is also hard to justify, since all the information about what the code changes is in the code itself. The AutoFrame tool presented here performs this analysis entirely automatically. It applies to object-oriented programming, where the issue is compounded by aliasing: if x is aliased to y, any update to x.a also affects y.a, even though the updating instruction usually does not even mention y. This aspect turns out to be the most delicate, and is addressed in AutoFrame by taking advantage of a companion tool, AutoAlias, which performs sound and sufficiently precise alias analysis, also in an entirely automatic way. Some practical results of AutoFrame so far are: (1) the automatic reconstruction (in about 25 seconds on an ordinary laptop) of the exact frame clauses, a total of 169 clauses, for an 8,000-line data structures and algorithms library which was previously (with the manually written frame clauses) verified for functional correctness using a mechanical program prover; and (2) the automatic generation (in less than 4 minutes) of frame conditions for a 150,000-line graphical and GUI library. The source code of AutoFrame and these examples are available for download.
SE Aug 27, 2018
AutoAlias: Automatic Variable-Precision Alias Analysis for Object-Oriented Programs
Victor Rivera, Bertrand Meyer
The aliasing question (can two reference expressions point, during an execution, to the same object?) is both one of the most critical in practice, for applications ranging from compiler optimization to program verification, and one of the most heavily researched, with many hundreds of publications over several decades. One might then expect that good off-the-shelf solutions are widely available, ready to be plugged into a compiler or verifier. This is not the case. In practice, efficient and precise alias analysis remains an open problem. We present a practical tool, AutoAlias, which can be used to perform automatic alias analysis for object-oriented programs. Based on the theory of "duality semantics", an application of Abstract Interpretation ideas, it is directed at object-oriented languages and has been implemented for Eiffel as an addition to the EiffelStudio environment. It offers variable-precision analysis, controllable through the choice of a constant that governs the number of fix point iterations: a higher number means better precision and higher computation time. All the source code of AutoAlias, as well as detailed results of analyses reported in this article, are publicly available. Practical applications so far have covered a library of data structures and algorithms and a library for GUI creation. For the former, AutoAlias achieves a precision appropriate for practical purposes and execution times in the order of 25 seconds for about 8000 lines of intricate code. For the GUI library, AutoAlias produces the alias analysis in around 232 seconds for about 150000 lines of intricate code.
SE Dec 14, 2017
Fourteen Years of Software Engineering at ETH Zurich
Bertrand Meyer
A Chair of Software Engineering existed at ETH Zurich, the Swiss Federal Institute of Technology, from 1 October 2001 to 31 January 2016, under my leadership. Our work, summarized here, covered a wide range of theoretical and practical topics, with object technology in the Eiffel method as the unifying thread.
SE Oct 8, 2017
AutoReq: expressing and verifying requirements for control systems
Alexandr Naumchev, Bertrand Meyer, Manuel Mazzara et al.
The considerable effort of writing requirements is only worthwhile if the result meets two conditions: the requirements reflect stakeholders' needs, and the implementation satisfies them. In usual approaches, the use of different notations for requirements (often natural language) and implementations (a programming language) makes both conditions elusive. AutoReq, presented in this article, takes a different approach to both the writing of requirements and their verification. Applying the approach to a well-documented example, a landing gear system, allowed for a mechanical proof of consistency and uncovered an error in a published discussion of the problem.
SE Apr 17, 2017
A contract-based method to specify stimulus-response requirements
Alexandr Naumchev, Manuel Mazzara, Bertrand Meyer et al.
A number of formal methods exist for capturing stimulus-response requirements in a declarative form. Someone still needs to translate the resulting declarative statements into imperative programs. The present article describes a method for specification and verification of stimulus-response requirements in the form of imperative program routines with conditionals and assertions. A program prover then checks a candidate program directly against the stated requirements. The article illustrates the approach by applying it to an ASM model of the Landing Gear System, a widely used realistic example proposed for evaluating specification and verification techniques.
SE Apr 13, 2017
Seamless Requirements
Alexandr Naumchev, Bertrand Meyer
Popular notations for functional requirements specifications frequently ignore developers' needs, target specific development models, or require translation of requirements into tests for verification; the results can give out-of-sync or downright incompatible artifacts. Seamless Requirements, a new approach to specifying functional requirements, contributes to developers' understanding of requirements and to software quality regardless of the process, while the process itself becomes lighter due to the absence of tests in the presence of formal verification. A development case illustrates these benefits, and a discussion compares seamless requirements to other approaches.
SE Aug 27, 2016
Class Invariants: Concepts, Problems, Solutions
Bertrand Meyer
Class invariants are both a core concept of object-oriented programming and the source of the two key open OO verification problems: furtive access (from callbacks) and reference leak. Existing approaches force on programmers an unacceptable annotation burden. This article explains invariants and solves both problems modularly through the O-rule, defining fundamental OO semantics, and the inhibition rule, using information hiding to remove harmful reference leaks. It also introduces the concept of "object tribe" as a basis for other possible approaches. For all readers: this article is long because it includes a tutorial, covers many examples and dispels misconceptions. To understand the key ideas and results, however, the first two pages suffice. For non-experts in verification: all concepts are explained; anyone with a basic understanding of object-oriented programming can understand the discussion. For experts: the main limitation of this work is that it is a paper proposal (no soundness proof, no implementation). It addresses, however, the known problems with class invariants, solving such examples as linked lists and Observer, through a simple theory and without any of the following: ownership; separation logic; universe types; object wrapping and unwrapping; semantic collaboration, observer specifications; history invariants; "inc" and "coop" constructs; friendship construct; non-modular reasoning. More generally, it involves no new language construct and no new programmer annotations.
DC Apr 15, 2016
An Interference-Free Programming Model for Network Objects
Mischael Schill, Christopher M. Poskitt, Bertrand Meyer
Network objects are a simple and natural abstraction for distributed object-oriented programming. Languages that support network objects, however, often leave synchronization to the user, along with its associated pitfalls, such as data races and the possibility of failure. In this paper, we present D-SCOOP, a distributed programming model that allows for interference-free and transaction-like reasoning on (potentially multiple) network objects, with synchronization handled automatically, and network failures managed by a compensation mechanism. We achieve this by leveraging the runtime semantics of a multi-threaded object-oriented concurrency model, directly generalizing it with a message-based protocol for efficiently coordinating remote objects. We present our pathway to fusing these contrasting but complementary ideas, and evaluate the performance overhead of the automatic synchronization in D-SCOOP, finding that it comes close to---or outperforms---explicit locking-based synchronization in Java RMI.
SE Feb 17, 2016
Unifying Requirements and Code: an Example
Alexandr Naumchev, Bertrand Meyer, Victor Rivera
Requirements and code, in conventional software engineering wisdom, belong to entirely different worlds. Is it possible to unify these two worlds? A unified framework could help make software easier to change and reuse. To explore the feasibility of such an approach, the case study reported here takes a classic example from the requirements engineering literature and describes it using a programming language framework to express both domain and machine properties. The paper describes the solution, discusses its benefits and limitations, and assesses its scalability.
SE Feb 12, 2016
Complete contracts through specification drivers
Alexandr Naumchev, Bertrand Meyer
Existing techniques of Design by Contract do not allow software developers to specify complete contracts in many cases. Incomplete contracts leave room for malicious implementations. This article complements Design by Contract with a simple yet powerful technique that removes the problem without adding syntactical mechanisms. The proposed technique makes it possible not only to derive complete contracts, but also to rigorously check and improve completeness of existing contracts without instrumenting them.
PL Jul 2, 2015
Theory of Programs
Bertrand Meyer
A general theory of programs, programming and programming languages built up from a few concepts of elementary set theory. Derives, as theorems, properties treated as axioms by classic approaches to programming. Covers sequential and concurrent computation.
SE Apr 27, 2015
On the Verification of SCOOP Programs
Georgiana Caltais, Bertrand Meyer
In this paper we focus on the development of a toolbox for the verification of programs in the context of SCOOP -- an elegant concurrency model, recently formalized based on Rewriting Logic (RL) and Maude. SCOOP is implemented in Eiffel and its applicability is demonstrated also from a practical perspective, in the area of robotics programming. Our contribution consists in devising and integrating an alias analyzer and a Coffman deadlock detector under the roof of the same RL-based semantic framework of SCOOP. This enables using the Maude rewriting engine and its LTL model-checker "for free", in order to perform the analyses of interest. We discuss the limitations of our approach for model-checking deadlocks and provide solutions to the state explosion problem. The latter is mainly caused by the size of the SCOOP formalization which incorporates all the aspects of a real concurrency model. On the aliasing side, we propose an extension of a previously introduced alias calculus based on program expressions, to the setting of unbounded program executions such as infinite loops and recursive calls. Moreover, we devise a corresponding executable specification easily implementable on top of the SCOOP formalization. An important property of our extension is that, in non-concurrent settings, the corresponding alias expressions can be over-approximated in terms of a notion of regular expressions. This further enables us to derive an algorithm that always stops and provides a sound over-approximation of the "may aliasing" information, where soundness stands for the lack of false negatives.
DC Oct 24, 2014
Contract-Based General-Purpose GPU Programming
Alexey Kolesnichenko, Christopher M. Poskitt, Sebastian Nanz et al.
Using GPUs as general-purpose processors has revolutionized parallel computing by offering, for a large and growing set of algorithms, massive data-parallelization on desktop machines. An obstacle to widespread adoption, however, is the difficulty of programming them and the low-level control of the hardware required to achieve good performance. This paper suggests a programming library, SafeGPU, that aims at striking a balance between programmer productivity and performance, by making GPU data-parallel operations accessible from within a classical object-oriented programming language. The solution is integrated with the design-by-contract approach, which increases confidence in functional program correctness by embedding executable program specifications into the program text. We show that our library leads to modular and maintainable code that is accessible to GPGPU non-experts, while providing performance that is comparable with hand-written CUDA code. Furthermore, runtime contract checking turns out to be feasible, as the contracts can be executed on the GPU.
DC Jul 4, 2014
Dynamic Checking of Safe Concurrent Memory Access using Shared Ownership
Mischael Schill, Sebastian Nanz, Bertrand Meyer
In shared-memory concurrent programming, shared resources can be protected using synchronization mechanisms such as monitors or channels. The connection between these mechanisms and the resources they protect is, however, only given implicitly; this makes it difficult both for programmers to apply the mechanisms correctly and for compilers to check that resources are properly protected. This paper presents a mechanism to automatically check that shared memory is accessed properly, using a methodology called shared ownership. In contrast to traditional ownership, shared ownership offers more flexibility by permitting multiple owners of a resource. On the basis of this methodology, we define an abstract model of resource access that provides operations to manage data dependencies, as well as sharing and transfer of access privileges. The model is rigorously defined using a formal semantics, and shown to be free from data races. This property can be used to detect unsafe memory accesses when simulating the model together with the execution of a program. The expressiveness and efficiency of the approach are demonstrated on a variety of programs using common synchronization mechanisms.
CY Jun 17, 2014
Teaching Software Engineering through Robotics
Jiwon Shin, Andrey Rusakov, Bertrand Meyer
This paper presents a newly-developed robotics programming course and reports the initial results of software engineering education in a robotics context. Robotics programming, as a multidisciplinary course, puts equal emphasis on software engineering and robotics. It teaches students proper software engineering -- in particular, modularity and documentation -- by having them implement four core robotics algorithms for an educational robot. To evaluate the effect of software engineering education in a robotics context, we analyze pre- and post-class survey data and the four assignments our students completed for the course. The analysis suggests that the students acquired an understanding of software engineering techniques and principles.
SE Mar 5, 2014
Automated Fixing of Programs with Contracts
Yu Pei, Carlo A. Furia, Martin Nordio et al.
This paper describes AutoFix, an automatic debugging technique that can fix faults in general-purpose software. To provide high-quality fix suggestions and to enable automation of the whole debugging process, AutoFix relies on the presence of simple specification elements in the form of contracts (such as pre- and postconditions). Using contracts enhances the precision of dynamic analysis techniques for fault detection and localization, and for validating fixes. The only required user input to the AutoFix supporting tool is then a faulty program annotated with contracts; the tool produces a collection of validated fixes for the fault ranked according to an estimate of their suitability. In an extensive experimental evaluation, we applied AutoFix to over 200 faults in four code bases of different maturity and quality (of implementation and of contracts). AutoFix successfully fixed 42% of the faults, producing, in the majority of cases, corrections of quality comparable to those competent programmers would write; the used computational resources were modest, with an average time per fix below 20 minutes on commodity hardware. These figures compare favorably to the state of the art in automated program fixing, and demonstrate that the AutoFix approach is successfully applicable to reduce the debugging burden in real-world scenarios.
SE Nov 25, 2013
Flexible Invariants Through Semantic Collaboration
Nadia Polikarpova, Julian Tschannen, Carlo A. Furia et al.
Modular reasoning about class invariants is challenging in the presence of dependencies among collaborating objects that need to maintain global consistency. This paper presents semantic collaboration: a novel methodology to specify and reason about class invariants of sequential object-oriented programs, which models dependencies between collaborating objects by semantic means. Combined with a simple ownership mechanism and useful default schemes, semantic collaboration achieves the flexibility necessary to reason about complicated inter-object dependencies but requires limited annotation burden when applied to standard specification patterns. The methodology is implemented in AutoProof, our program verifier for the Eiffel programming language (but it is applicable to any language supporting some form of representation invariants). An evaluation on several challenge problems proposed in the literature demonstrates that it can handle a variety of idiomatic collaboration patterns, and is more widely applicable than the existing invariant methodologies.
SE Aug 5, 2013
Handling Parallelism in a Concurrency Model
Mischael Schill, Sebastian Nanz, Bertrand Meyer
Programming models for concurrency are optimized for dealing with nondeterminism, for example to handle asynchronously arriving events. To shield the developer from data race errors effectively, such models may prevent shared access to data altogether. However, this restriction also makes them unsuitable for applications that require data parallelism. We present a library-based approach for permitting parallel access to arrays while preserving the safety guarantees of the original model. When applied to SCOOP, an object-oriented concurrency model, the approach exhibits a negligible performance overhead compared to ordinary threaded implementations of two parallel benchmark programs.
PL Jul 11, 2013
Alias and Change Calculi, Applied to Frame Inference
Alexander Kogtenkov, Bertrand Meyer, Sergey Velder
Alias analysis, which determines whether two expressions in a program may refer to the same object, has many potential applications in program construction and verification. We have developed a theory for alias analysis, the "alias calculus", implemented its application to an object-oriented language, and integrated the result into a modern IDE. The calculus has a higher level of precision than many existing alias analysis techniques. One of the principal applications is to allow automatic change analysis, which leads to inferring "modifies clauses", providing a significant advance towards addressing the Frame Problem. Experiments were able to infer the "modifies" clauses of an existing formally specified library. Other applications, in particular to concurrent programming, also appear possible. The article presents the calculus, the application to frame analysis including experimental results, and other projected applications. The ongoing work includes building a more efficient model capturing aliasing properties and a soundness proof for its essential elements.
SE Aug 16, 2012
What Good Are Strong Specifications?
Nadia Polikarpova, Carlo A. Furia, Yu Pei et al.
Experience with lightweight formal methods suggests that programmers are willing to write specifications if they bring tangible benefits to their usual development activities. This paper considers stronger specifications and studies whether they can be deployed as an incremental practice that brings additional benefits without being unacceptably expensive. We introduce a methodology that extends Design by Contract to write strong specifications of functional properties in the form of preconditions, postconditions, and invariants. The methodology aims at being palatable to developers who are not fluent in formal techniques but are comfortable with writing simple specifications. We evaluate the cost and the benefits of using strong specifications by applying the methodology to testing data structure implementations written in Eiffel and C#. In our extensive experiments, testing against strong specifications detects twice as many bugs as standard contracts, with a reasonable overhead in terms of annotation burden and run-time performance while testing. In the wide spectrum of formal techniques for software quality, testing against strong specifications lies in a "sweet spot" with a favorable benefit to effort ratio.