Bernd Finkbeiner

h-index: 9

35 papers

384 citations

Novelty: 46%

AI Score: 54

Ranked #27,018 of 201,326 authors (top 13%); #10 in LO (top 3%)

35 Papers

SE Jun 4, 2022

Formal Specifications from Natural Language

Christopher Hahn, Frederik Schmitt, Julia J. Tillman et al.

We study the generalization abilities of language models when translating natural language into formal specifications with complex semantics. In particular, we fine-tune language models on three datasets consisting of English sentences and their corresponding formal representation: 1) regular expressions (regex), frequently used in programming and search; 2) first-order logic (FOL), commonly used in software verification and theorem proving; and 3) linear-time temporal logic (LTL), which forms the basis for industrial hardware specification languages. Our experiments show that, in these diverse domains, the language models retain the generalization capabilities acquired from pre-training on natural language and generalize, e.g., to new variable names or operator descriptions. Additionally, they achieve competitive performance, and even outperform the state-of-the-art for translating into regular expressions, with the benefits of being easy to access, efficient to fine-tune, and without a particular need for domain-specific reasoning.

LG Mar 2, 2023

Iterative Circuit Repair Against Formal Specifications

Matthias Cosler, Frederik Schmitt, Christopher Hahn et al.

We present a deep learning approach for repairing sequential circuits against formal specifications given in linear-time temporal logic (LTL). Given a defective circuit and its formal specification, we train Transformer models to output circuits that satisfy the corresponding specification. We propose a separated hierarchical Transformer for multimodal representation learning of the formal specification and the circuit. We introduce a data generation algorithm that enables generalization to more complex specifications and out-of-distribution datasets. In addition, our proposed repair mechanism significantly improves the automated synthesis of circuits from LTL specifications with Transformers. It improves the state-of-the-art by $6.8$ percentage points on held-out instances and $11.8$ percentage points on an out-of-distribution dataset from the annual reactive synthesis competition.

LO Jun 15, 2023

Counterfactuals Modulo Temporal Logics

Bernd Finkbeiner, Julian Siber

Lewis' theory of counterfactuals is the foundation of many contemporary notions of causality. In this paper, we extend this theory in the temporal direction to enable symbolic counterfactual reasoning on infinite sequences, such as counterexamples found by a model checker and trajectories produced by a reinforcement learning agent. In particular, our extension considers a more relaxed notion of similarity between worlds and proposes two additional counterfactual operators that close a semantic gap between the previous two in this more general setting. Further, we consider versions of counterfactuals that minimize the distance to the witnessing counterfactual worlds, a common requirement in causal analysis. To automate counterfactual reasoning in the temporal domain, we introduce a logic that combines temporal and counterfactual operators, and outline decision procedures for the satisfiability and trace-checking problems of this logic.

LO Nov 23, 2016

What You Really Need To Know About Your Neighbor

Werner Damm, Bernd Finkbeiner, Astrid Rakow

A fundamental question in system design is to decide how much of the design of one component must be known in order to successfully design another component of the system. We study this question in the setting of reactive synthesis, where one constructs a system implementation from a specification given in temporal logic. In previous work, we have shown that the system can be constructed compositionally, one component at a time, if the specification admits a "dominant" strategy for each component. In this paper, we generalize the approach to settings where dominant strategies only exist under certain assumptions about the future behavior of the other components. We present an incremental synthesis method based on the automatic construction of such assumptions.

74.5 LO May 26

Almost Fair Simulations

Arthur Correnson, Iona Kuhn, Bernd Finkbeiner

It is well known that liveness properties cannot be proven using standard simulation arguments. This issue has been mitigated by extending standard notions of simulation for transition systems to fairness-preserving simulations for systems equipped with an additional fairness condition modeling liveness assumptions and/or liveness requirements. In the context of automated verification of finite-state systems, proofs by simulation are an appealing method as there exist efficient algorithms to find a simulation between two systems. However, applications of fair simulation to interactive verification have been much less studied. Perhaps one reason is that the definitions of fair simulation relations typically involve non-trivial nestings of inductive and coinductive relations, making them particularly difficult to use and to reason about. In this paper, we argue that in many cases, stronger notions of fair simulation involving more controlled alternations of fixed points are sufficient. Starting from known fair simulation techniques, we progressively build up a family of almost fair simulation relations for transition systems equipped with a Büchi fairness condition. The simulation relations we present can all be equipped with intuitive reasoning rules, leading to elegant deductive systems to prove fair trace inclusion. We mechanized our simulation relations and their associated deductive systems in the Rocq proof assistant, proved their soundness, and we demonstrate their use through a selection of examples.

29.0 PL May 26

Pacing Types for Asynchronous Stream Equations

Florian Kohn, Arthur Correnson, Jan Baumeister et al.

Stream-based monitoring is a runtime verification approach where a monitor aggregates streams of input data from sensors and other sources to give real-time statistics and assessments of a system's health. One of the central challenges in designing reliable stream-based monitors is to deal with the asynchronous nature of data streams: in concrete applications, the different sensors being monitored produce values at different speeds, and it is the monitor's responsibility to correctly react to the asynchronous arrival of different streams of values. To ease this process, modern frameworks for stream-based monitoring such as RTLola enable users to finely specify data synchronization policies via a system of pacing annotations. While this feature simplifies the design of monitors, it can also lead users to write inconsistent policies, where synchronization between two streams is explicitly requested via annotations, but cannot always be achieved. To mitigate this issue, this paper presents pacing types, a novel type system implemented in RTLola to ensure that monitors for asynchronous streams are free of timing inconsistencies. We give a formal semantics to pacing annotations for a core fragment of RTLola, and present a soundness proof of the pacing type system. For an additional level of guarantees, we machine-checked the soundness proof using the Rocq proof assistant.

LG May 30, 2022

Attention Flows for General Transformers

Niklas Metzger, Christopher Hahn, Julian Siber et al.

In this paper, we study the computation of how much an input token in a Transformer model influences its prediction. We formalize a method to construct a flow network out of the attention values of encoder-only Transformer models and extend it to general Transformer architectures including an auto-regressive decoder. We show that running a maxflow algorithm on the flow network construction yields Shapley values, which determine the impact of a player in cooperative game theory. By interpreting the input tokens in the flow network as players, we can compute their influence on the total attention flow leading to the decoder's decision. Additionally, we provide a library that computes and visualizes the attention flow of arbitrary Transformer models. We show the usefulness of our implementation on various models trained on natural language processing and reasoning tasks.
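
The max-flow step described in this abstract can be illustrated with a minimal sketch for the encoder-only case. It is not the authors' library: the function names `max_flow` and `attention_flow`, the toy attention matrices, and the choice to use attention weights directly as edge capacities are assumptions made for illustration.

```python
from collections import defaultdict, deque


def max_flow(capacity, source, sink):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity graph."""
    # Residual graph: forward edges carry the capacity, reverse edges start at 0.
    residual = defaultdict(lambda: defaultdict(float))
    for u, neighbours in capacity.items():
        for v, c in neighbours.items():
            residual[u][v] += c
            residual[v][u] += 0.0
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > 1e-12 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total
        # Find the bottleneck capacity along the path.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        # Push the bottleneck amount of flow along the path.
        v = sink
        while parent[v] is not None:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        total += bottleneck


def attention_flow(attn, target):
    """Flow from position `target` in the top layer to every input token.

    attn[l][i][j] is the (assumed) attention that position i in layer l + 1
    pays to position j in layer l; layer 0 holds the input tokens.
    """
    n_layers, n_tokens = len(attn), len(attn[0])
    capacity = {}
    for l, layer in enumerate(attn):
        for i in range(n_tokens):
            capacity[(l + 1, i)] = {(l, j): layer[i][j] for j in range(n_tokens)}
    source = (n_layers, target)
    return [max_flow(capacity, source, (0, j)) for j in range(n_tokens)]


# Toy example: two layers, two tokens (values are made up).
toy_attention = [
    [[0.5, 0.5], [0.25, 0.75]],  # layer 1 attending to the input tokens
    [[0.5, 0.5], [1.0, 0.0]],    # layer 2 attending to layer 1
]
flows = attention_flow(toy_attention, target=0)
print(flows)  # [0.75, 1.0]
```

In this toy network the second input token receives more flow than the first, because both paths from the top-layer position can carry their full capacity to it. The extension to auto-regressive decoders mentioned in the abstract is not covered by this sketch.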

25.6 SE Mar 11

Type-safe Monitoring of Parameterized Streams

Jan Baumeister, Bernd Finkbeiner, Florian Kohn

Stream-based monitoring is a real-time safety assurance mechanism for complex cyber-physical systems such as unmanned aerial vehicles. The monitor aggregates streams of input data from sensors and other sources to give real-time statistics and assessments of the system's health. Since the monitor is a safety-critical component, it is mandatory to ensure the absence of runtime errors in the monitor. Providing such guarantees is particularly challenging when the monitor must handle unbounded data domains, like an unlimited number of airspace participants, requiring the use of dynamic data structures. This paper provides a type-safe integration of parameterized streams into the stream-based monitoring framework RTLola. Parameterized streams generalize individual streams to sets of an unbounded number of stream instances and provide a systematic mechanism for memory management. We show that the absence of runtime errors is, in general, undecidable but can be effectively ensured with a refinement type system that guarantees all memory references are either successful or backed by a default value. We report on the performance of the type analysis on example specifications from a range of benchmarks, including specifications from the monitoring of autonomous aircraft.

15.0 CR May 22

Less Effort, Shorter Proofs: Reinforcement Learning for Security Protocol Analysis in Tamarin

Matthias Cosler, Cas Cremers, Bernd Finkbeiner et al.

Tools like Tamarin and ProVerif have achieved notable success in analyzing and verifying complex real-world protocols such as EMV, 5G, and WPA2, even detecting zero-day exploits. Despite these successes, verifying such protocols remains a time-consuming, challenging task, often requiring significant human effort and expertise. In this paper, we present a reinforcement learning (RL) framework inspired by AlphaZero and AlphaProof that implements a new style of proof search for Tamarin. We have developed a stateless API for Tamarin that acts as a classical RL environment. We guide a Monte Carlo Tree Search (MCTS) by a neural heuristic that learns from completed subproofs. We evaluate our framework on 16 case studies, ranging from classical protocol models to challenging state-of-the-art protocol models from recent publications. Our method finds more proofs automatically than Tamarin's standard search and produces shorter proofs than both the standard and human-engineered heuristics. Our pipeline is applicable out of the box to assist Tamarin users in active research, reducing the human effort required. Moreover, our standardized interface provides a programmatic way for users to interact with Tamarin. Finally, our work demonstrates the promising potential of adapting RL-based methods to the Tamarin domain.

LO Sep 10, 2025

Trace Repair for Temporal Behavior Trees

Sebastian Schirmer, Philipp Schitz, Johann C. Dauer et al.

We present methods for repairing traces against specifications given as temporal behavior trees (TBT). TBT are a specification formalism for action sequences in robotics and cyber-physical systems, where specifications of sub-behaviors, given in signal temporal logic, are composed using operators for sequential and parallel composition, fallbacks, and repetition. Trace repairs are useful to explain failures and as training examples that avoid the observed problems. In principle, repairs can be obtained via mixed-integer linear programming (MILP), but this is far too expensive for practical applications. We present two practical repair strategies: (1) incremental repair, which reduces the MILP by splitting the trace into segments, and (2) landmark-based repair, which solves the repair problem iteratively using TBT's robust semantics as a heuristic that approximates MILP with more efficient linear programming. In our experiments, we were able to repair traces with more than 25,000 entries in under ten minutes, while MILP runs out of memory.

AI Dec 29, 2025

On Conformant Planning and Model-Checking of $\exists^*\forall^*$ Hyperproperties

Raven Beutner, Bernd Finkbeiner

We study the connection of two problems within the planning and verification community: Conformant planning and model-checking of hyperproperties. Conformant planning is the task of finding a sequential plan that achieves a given objective independent of non-deterministic action effects during the plan's execution. Hyperproperties are system properties that relate multiple execution traces of a system and, e.g., capture information-flow and fairness policies. In this paper, we show that model-checking of $\exists^*\forall^*$ hyperproperties is closely related to the problem of computing a conformant plan. Firstly, we show that we can efficiently reduce a hyperproperty model-checking instance to a conformant planning instance, and prove that our encoding is sound and complete. Secondly, we establish the converse direction: Every conformant planning problem is, itself, a hyperproperty model-checking task.
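
To make the definition of conformant planning above concrete, here is a minimal brute-force sketch. It is not the encoding from the paper: the function `conformant_plan`, the toy actions `left` and `jump`, and the breadth-first search over belief states (sets of states the system might be in) are assumptions chosen for illustration.

```python
from collections import deque


def conformant_plan(initial, goal, actions, max_len=10):
    """Search for an action sequence that reaches `goal` from every initial
    state under every non-deterministic outcome.

    `actions` maps an action name to a function state -> set of successors.
    """
    start = frozenset(initial)
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        belief, plan = queue.popleft()
        # The plan works only if every state we might be in is a goal state.
        if belief <= goal:
            return plan
        if len(plan) == max_len:
            continue
        for name, effect in actions.items():
            successor = frozenset(t for s in belief for t in effect(s))
            if successor and successor not in seen:
                seen.add(successor)
                queue.append((successor, plan + [name]))
    return None


# Toy domain (hypothetical): positions 0..3 on a line.
toy_actions = {
    "left": lambda s: {max(s - 1, 0)},  # deterministic step towards 0
    "jump": lambda s: {0, 3},           # non-deterministic: lands on 0 or 3
}
plan = conformant_plan(initial={1, 2}, goal={0}, actions=toy_actions)
print(plan)  # ['left', 'left']
```

The action `jump` can reach the goal but may also land on 3, so a plan that relies on it is not conformant. The returned plan uses only `left`, which succeeds regardless of the unknown start state.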

83.3 LG May 14

Natural Synthesis: Outperforming Reactive Synthesis Tools with Large Reasoning Models

Frederik Schmitt, Matthias Cosler, Niklas Metzger et al.

Reactive synthesis, the problem of automatically constructing a hardware circuit from a logical specification, is a long-standing challenge in formal verification. It is elusive for two reasons: It is algorithmically hard, and writing formal specifications by hand is notoriously difficult. In this paper, we tackle both sides of the problem. For the algorithmic side, we present a neuro-symbolic approach to reactive synthesis that couples large reasoning models with model checkers to iteratively repair a synthesized Verilog implementation via sound symbolic feedback. Our approach solves more benchmarks than the best dedicated tools in the annual synthesis competition and extends to constructing parameterized systems, a problem known to be undecidable. On the specification side, we introduce an autoformalization step that shifts the specification task from temporal logic to natural language by introducing a hand-authored dataset of natural-language specifications for evaluation. We demonstrate performance comparable to that of starting from formal specifications, establishing natural synthesis as a viable end-to-end workflow.

19.9 CR May 4

Differentially Private Runtime Monitoring

Bernd Finkbeiner, Frederik Scheerer

Modern stream-based monitors collect detailed statistics of the runtime behavior of the system under observation. If the system runs in a privacy-sensitive context, this poses the risk of disclosing sensitive information. Differential privacy is the state-of-the-art approach for protecting sensitive information; however, integrating it into runtime monitoring is challenging: temporal operators can cause individual input values to influence multiple outputs over time, leading to repeated disclosure of private information. We propose an approach that automatically enforces differential privacy in stream-based monitoring specifications by analyzing temporal dependencies and injecting carefully calibrated noise into the specification. To preserve the utility of the outputs, we identify strategically chosen positions in the specification for noise injection and leverage tree-based mechanisms to mitigate the accuracy loss caused by noise injected into aggregation operators. We demonstrate the practicality and effectiveness of our approach in a case study on monitoring public transportation usage.

31.9 LO May 3

Knowledge Compilation for Quantification in Alternating Automata

S. Akshay, Alfredo Cantarella, Supratik Chakraborty et al.

We present a knowledge compilation approach for existential and universal quantification in alternating automata. Knowledge compilation transforms formulas into normal forms with special properties that enable efficient answering of questions of interest. For Boolean formulas, several normal forms that have proven effective for existential/universal quantification, and even for functional synthesis, have been studied in the literature. For infinite word automata, quantification is a fundamental operation in verification tasks such as QPTL satisfiability checking and HyperLTL model checking. Existing algorithms rely on nondeterministic infinite word automata, where existential projection can be efficiently performed state-wise, but universal projection requires complementation. Complementing nondeterministic infinite word automata, however, is expensive, making existing algorithms infeasible in practice. Towards addressing this problem, we propose novel knowledge compilation techniques for existential and universal quantification on alternating safety automata. Our approach compiles alternating automata into normal forms where projection can be applied uniformly and efficiently to each state's transition function. Using the compilations for each type of quantification, we can effectively eliminate a sequence of alternating quantifiers in formulas without complementation. Our BDD-based prototype demonstrates the practical effectiveness of our algorithms on a suite of QPTL satisfiability benchmarks.

LO May 22, 2024

Non-Deterministic Planning for Hyperproperty Verification

Raven Beutner, Bernd Finkbeiner

Non-deterministic planning aims to find a policy that achieves a given objective in an environment where actions have uncertain effects, and the agent, potentially, only observes parts of the current state. Hyperproperties are properties that relate multiple paths of a system and can, e.g., capture security and information-flow policies. Popular logics for expressing temporal hyperproperties, such as HyperLTL, extend LTL by offering selective quantification over executions of a system. In this paper, we show that planning offers a powerful intermediate language for the automated verification of hyperproperties. Concretely, we present an algorithm that, given a HyperLTL verification problem, constructs a non-deterministic multi-agent planning instance (in the form of a QDec-POMDP) that, when admitting a plan, implies the satisfaction of the verification problem. We show that for large fragments of HyperLTL, the resulting planning instance corresponds to a classical, FOND, or POND planning problem. We implement our encoding in a prototype verification tool and report on encouraging experimental results.

LO Apr 15, 2024

Monitoring Second-Order Hyperproperties

Raven Beutner, Bernd Finkbeiner, Hadar Frenkel et al.

Hyperproperties express the relationship between multiple executions of a system. This is needed in many AI-related fields, such as knowledge representation and planning, to capture system properties related to knowledge, information flow, and privacy. In this paper, we study the monitoring of complex hyperproperties at runtime. Previous work in this area has either focused on the simpler problem of monitoring trace properties (which are sets of traces, while hyperproperties are sets of sets of traces) or on monitoring first-order hyperproperties, which are expressible in temporal logics with first-order quantification over traces, such as HyperLTL. We present the first monitoring algorithm for the much more expressive class of second-order hyperproperties. Second-order hyperproperties include system properties like common knowledge, which cannot be expressed in first-order logics like HyperLTL. We introduce Hyper$^2$LTL$_f$, a temporal logic over finite traces that allows for second-order quantification over sets of traces. We study the monitoring problem in two fundamental execution models: (1) the parallel model, where a fixed number of traces is monitored in parallel, and (2) the sequential model, where an unbounded number of traces is observed sequentially, one trace after the other. For the parallel model, we show that the monitoring of the second-order hyperproperties of Hyper$^2$LTL$_f$ can be reduced to monitoring first-order hyperproperties. For the sequential model, we present a monitoring algorithm that handles second-order quantification efficiently, exploiting optimizations based on the monotonicity of subformulas, graph-based storing of executions, and fixpoint hashing. We present experimental results from a range of benchmarks, including examples from common knowledge and planning.
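
The distinction drawn above between trace properties (checked one trace at a time) and hyperproperties (checked on a set of traces as a whole) can be sketched as follows. This is only an illustration of a simple first-order hyperproperty, not the second-order monitoring algorithm of the paper. The trace encoding as tuples of (input, output) pairs and the function names are assumptions.

```python
from itertools import product


def always_nonnegative(trace):
    """Trace property: every output on this single trace is non-negative."""
    return all(output >= 0 for _, output in trace)


def holds_on_all_traces(traces, trace_property):
    """A trace property holds for a system if each trace satisfies it alone."""
    return all(trace_property(t) for t in traces)


def observational_determinism(traces):
    """Hyperproperty: any two traces with equal inputs have equal outputs.

    This relates pairs of traces, so it cannot be decided by looking at
    one trace in isolation.
    """
    for t1, t2 in product(traces, repeat=2):
        same_inputs = [i for i, _ in t1] == [i for i, _ in t2]
        same_outputs = [o for _, o in t1] == [o for _, o in t2]
        if same_inputs and not same_outputs:
            return False
    return True


# Two finite traces with identical inputs but different outputs (made-up data).
trace_a = ((0, 1), (1, 2))
trace_b = ((0, 1), (1, 3))
system = {trace_a, trace_b}

print(holds_on_all_traces(system, always_nonnegative))  # True
print(observational_determinism(system))                # False
print(observational_determinism({trace_a}))             # True
```

Each trace passes the trace property on its own, yet the pair violates observational determinism, which only becomes visible when the traces are compared.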

MA Mar 20, 2024

Hyper Strategy Logic

Raven Beutner, Bernd Finkbeiner

Strategy logic (SL) is a powerful temporal logic that enables strategic reasoning in multi-agent systems. SL supports explicit (first-order) quantification over strategies and provides a logical framework to express many important properties such as Nash equilibria, dominant strategies, etc. While in SL the same strategy can be used in multiple strategy profiles, each such profile is evaluated w.r.t. a path-property, i.e., a property that considers the single path resulting from a particular strategic interaction. In this paper, we present Hyper Strategy Logic (HyperSL), a strategy logic where the outcome of multiple strategy profiles can be compared w.r.t. a hyperproperty, i.e., a property that relates multiple paths. We show that HyperSL can capture important properties that cannot be expressed in SL, including non-interference, quantitative Nash equilibria, optimal adversarial planning, and reasoning under imperfect information. On the algorithmic side, we identify an expressive fragment of HyperSL with decidable model checking and present a model-checking algorithm. We contribute a prototype implementation of our algorithm and report on encouraging experimental results.

LG Jan 30, 2025

Stream-Based Monitoring of Algorithmic Fairness

Jan Baumeister, Bernd Finkbeiner, Frederik Scheerer et al.

Automatic decision and prediction systems are increasingly deployed in applications where they significantly impact the livelihood of people, such as for predicting the creditworthiness of loan applicants or the recidivism risk of defendants. These applications have given rise to a new class of algorithmic-fairness specifications that require the systems to decide and predict without bias against social groups. Verifying these specifications statically is often out of reach for realistic systems, since the systems may, e.g., employ complex learning components, and reason over a large input space. In this paper, we therefore propose stream-based monitoring as a solution for verifying the algorithmic fairness of decision and prediction systems at runtime. Concretely, we present a principled way to formalize algorithmic fairness over temporal data streams in the specification language RTLola and demonstrate the efficacy of this approach on a number of benchmarks. Besides synthetic scenarios that particularly highlight its efficiency on streams with a scaling amount of data, we notably evaluate the monitor on real-world data from the recidivism prediction tool COMPAS.
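
As a rough illustration of monitoring fairness over a stream of decisions, the sketch below tracks per-group acceptance rates incrementally. It is plain Python rather than RTLola, and the class `ParityMonitor`, the choice of demographic parity as the fairness notion, and the threshold value are assumptions made for this example.

```python
from collections import defaultdict


class ParityMonitor:
    """Streaming check that acceptance rates of all groups stay close."""

    def __init__(self, threshold):
        self.threshold = threshold          # largest tolerated rate gap
        self.total = defaultdict(int)       # decisions seen per group
        self.accepted = defaultdict(int)    # positive decisions per group

    def observe(self, group, accepted):
        """Consume one decision and return True while the gap is tolerated."""
        self.total[group] += 1
        self.accepted[group] += int(accepted)
        rates = [self.accepted[g] / self.total[g] for g in self.total]
        return max(rates) - min(rates) <= self.threshold


# Made-up decision stream: (group, was the applicant accepted?)
events = [("a", True), ("b", True), ("a", True), ("b", False), ("b", False)]
monitor = ParityMonitor(threshold=0.5)
verdicts = [monitor.observe(group, accepted) for group, accepted in events]
print(verdicts)  # [True, True, True, True, False]
```

The monitor keeps only two counters per group, so its memory does not grow with the length of the stream. The final event pushes the gap between the groups above the threshold, and the verdict flips to False.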

LG Feb 13, 2024

Learning Better Representations From Less Data For Propositional Satisfiability

Mohamed Ghanem, Frederik Schmitt, Julian Siber et al.

Training neural networks on NP-complete problems typically demands very large amounts of training data and often needs to be coupled with computationally expensive symbolic verifiers to ensure output correctness. In this paper, we present NeuRes, a neuro-symbolic approach to address both challenges for propositional satisfiability, the quintessential NP-complete problem. By combining certificate-driven training and expert iteration, our model learns better representations than models trained for classification only, with a much higher data efficiency, requiring orders of magnitude less training data. NeuRes employs propositional resolution as a proof system to generate proofs of unsatisfiability and to accelerate the process of finding satisfying truth assignments, exploring both possibilities in parallel. To realize this, we propose an attention-based architecture that autoregressively selects pairs of clauses from a dynamic formula embedding to derive new clauses. Furthermore, we employ expert iteration whereby model-generated proofs progressively replace longer teacher proofs as the new ground truth. This enables our model to reduce a dataset of proofs generated by an advanced solver by ~32% after training on it with no extra guidance. This shows that NeuRes is not limited by the optimality of the teacher algorithm owing to its self-improving workflow. We show that our model achieves far better performance than NeuroSAT in terms of both correctly classified and proven instances.
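
The propositional resolution rule that serves as the proof system here can be sketched in a few lines. This shows only the symbolic rule, not the neural clause selection of NeuRes. The clause encoding as frozensets of non-zero integers (a negative number is a negated variable) and the function name `resolve` are assumptions.

```python
def resolve(clause_pos, clause_neg, pivot):
    """Resolve two clauses on `pivot`.

    `clause_pos` must contain the literal `pivot` and `clause_neg` its
    negation `-pivot`; the resolvent is the union of the remaining literals.
    """
    if pivot not in clause_pos or -pivot not in clause_neg:
        raise ValueError("pivot must occur positively and negatively")
    return frozenset((clause_pos - {pivot}) | (clause_neg - {-pivot}))


# Unsatisfiable toy formula: (x1) AND (NOT x1 OR x2) AND (NOT x2)
c1 = frozenset({1})
c2 = frozenset({-1, 2})
c3 = frozenset({-2})

step1 = resolve(c1, c2, pivot=1)     # derives the unit clause (x2)
step2 = resolve(step1, c3, pivot=2)  # derives the empty clause
print(step1)                # frozenset({2})
print(step2 == frozenset()) # True
```

Deriving the empty clause is a certificate of unsatisfiability, and the sequence of resolution steps is the proof. A shorter sequence of steps for the same formula is a shorter proof, which is the quantity the expert-iteration loop in the abstract reduces.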

AI Dec 19, 2023

On Alternating-Time Temporal Logic, Hyperproperties, and Strategy Sharing

Raven Beutner, Bernd Finkbeiner

Alternating-time temporal logic (ATL$^*$) is a well-established framework for formal reasoning about multi-agent systems. However, while ATL$^*$ can reason about the strategic ability of agents (e.g., some coalition $A$ can ensure that a goal is reached eventually), we cannot compare multiple strategic interactions, nor can we require multiple agents to follow the same strategy. For example, we cannot state that coalition $A$ can reach a goal sooner (or more often) than some other coalition $A'$. In this paper, we propose HyperATL$^*_S$, an extension of ATL$^*$ in which we can (1) compare the outcome of multiple strategic interactions w.r.t. a hyperproperty, i.e., a property that refers to multiple paths at the same time, and (2) enforce that some agents share the same strategy. We show that HyperATL$^*_S$ is a rich specification language that captures important AI-related properties that were out of reach of existing logics. We prove that model checking of HyperATL$^*_S$ on concurrent game structures is decidable. We implement our model-checking algorithm in a tool we call HyMASMC and evaluate it on a range of benchmarks.

LO Oct 4, 2025

Strategy Logic, Imperfect Information, and Hyperproperties

Raven Beutner, Bernd Finkbeiner

Strategy logic (SL) is a powerful temporal logic that enables first-class reasoning over strategic behavior in multi-agent systems (MAS). In many MASs, the agents (and their strategies) cannot observe the global state of the system, leading to many extensions of SL centered around imperfect information, such as strategy logic with imperfect information (SL$_\mathit{ii}$). Along orthogonal lines, researchers have studied the combination of strategic behavior and hyperproperties. Hyperproperties are system properties that relate multiple executions in a system and commonly arise when specifying security policies. Hyper Strategy Logic (HyperSL) is a temporal logic that combines quantification over strategies with the ability to express hyperproperties on the executions of different strategy profiles. In this paper, we study the relation between SL$_\mathit{ii}$ and HyperSL. Our main result is that both logics (restricted to formulas where no state formulas are nested within path formulas) are equivalent in the sense that we can encode SL$_\mathit{ii}$ instances into HyperSL instances and vice versa. For the former direction, we build on the well-known observation that imperfect information is a hyperproperty. For the latter direction, we construct a self-composition of MASs and show how we can simulate hyperproperties using imperfect information.

LO Sep 1, 2025

An Information-Flow Perspective on Explainability Requirements: Specification and Verification

Bernd Finkbeiner, Hadar Frenkel, Julian Siber

Explainable systems expose information about why certain observed effects are happening to the agents interacting with them. We argue that this constitutes a positive flow of information that needs to be specified, verified, and balanced against negative information flow that may, e.g., violate privacy guarantees. Since both explainability and privacy require reasoning about knowledge, we tackle these tasks with epistemic temporal logic extended with quantification over counterfactual causes. This allows us to specify that a multi-agent system exposes enough information such that agents acquire knowledge on why some effect occurred. We show how this principle can be used to specify explainability as a system-level requirement and provide an algorithm for checking finite-state models against such specifications. We present a prototype implementation of the algorithm and evaluate it on several benchmarks, illustrating how our approach distinguishes between explainable and unexplainable systems, and how it allows posing additional privacy requirements.

LG Oct 2, 2025

Learning Representations Through Contrastive Neural Model Checking

Vladimir Krsmanovic, Matthias Cosler, Mohamed Ghanem et al.

Model checking is a key technique for verifying safety-critical systems against formal specifications, where recent applications of deep learning have shown promise. However, while ubiquitous for vision and language domains, representation learning remains underexplored in formal verification. We introduce Contrastive Neural Model Checking (CNML), a novel method that leverages the model checking task as a guiding signal for learning aligned representations. CNML jointly embeds logical specifications and systems into a shared latent space through a self-supervised contrastive objective. On industry-inspired retrieval tasks, CNML considerably outperforms both algorithmic and neural baselines in cross-modal and intra-modal settings. We further show that the learned representations effectively transfer to downstream tasks and generalize to more complex formulas. These findings demonstrate that model checking can serve as an objective for learning representations for formal languages.

HC Aug 8, 2021

Visual Analysis of Hyperproperties for Understanding Model Checking Results

Tom Horak, Norine Coenen, Niklas Metzger et al.

Model checkers provide algorithms for proving that a mathematical model of a system satisfies a given specification. In case of a violation, a counterexample that shows the erroneous behavior is returned. Understanding these counterexamples is challenging, especially for hyperproperty specifications, i.e., specifications that relate multiple executions of a system to each other. We aim to facilitate the visual analysis of such counterexamples through our HyperVis tool, which provides interactive visualizations of the given model, specification, and counterexample. Within an iterative and interdisciplinary design process, we developed visualization solutions that can effectively communicate the core aspects of the model checking result. Specifically, we introduce graphical representations of binary values for improving pattern recognition, color encoding for better indicating related aspects, visually enhanced textual descriptions, as well as extensive cross-view highlighting mechanisms. Further, through an underlying causal analysis of the counterexample, we are also able to identify values that contributed to the violation and use this knowledge for both improved encoding and highlighting. Finally, the analyst can modify both the specification of the hyperproperty and the system directly within HyperVis and initiate the model checking of the new version. In combination, these features notably support the analyst in understanding the error leading to the counterexample as well as iterating the provided system and specification. We ran multiple case studies with HyperVis and tested it with domain experts in qualitative feedback sessions. The participants' positive feedback confirms the considerable improvement over the manual, text-based status quo and the value of the tool for explaining hyperproperties.

LG Jul 25, 2021

Neural Circuit Synthesis from Specification Patterns

Frederik Schmitt, Christopher Hahn, Markus N. Rabe et al.

We train hierarchical Transformers on the task of synthesizing hardware circuits directly from high-level logical specifications in linear-time temporal logic (LTL). The LTL synthesis problem is a well-known algorithmic challenge with a long history, and an annual competition is organized to track the improvement of algorithms and tooling over time. New approaches using machine learning might open up many possibilities in this area, but they suffer from a lack of sufficient training data. In this paper, we consider a method to generate large amounts of additional training data, i.e., pairs of specifications and circuits implementing them. We ensure that this synthetic data is sufficiently close to human-written specifications by mining common patterns from the specifications used in the synthesis competitions. We show that hierarchical Transformers trained on this synthetic data solve a significant portion of problems from the synthesis competitions, and even out-of-distribution examples from a recent case study.

SE Dec 15, 2020

Verified Rust Monitors for Lola Specifications

Bernd Finkbeiner, Stefan Oswald, Noemi Passing et al.

The safety of cyber-physical systems rests on the correctness of their monitoring mechanisms. This is problematic if the specification of the monitor is implemented manually or interpreted by unreliable software. We present a verifying compiler that translates specifications given in the stream-based monitoring language Lola to implementations in Rust. The generated code contains verification annotations that enable the Viper toolkit to automatically prove functional correctness, absence of memory faults, and guaranteed termination. The compiler parallelizes the evaluation of different streams in the monitor based on a dependency analysis of the specification. We present encouraging experimental results obtained with monitor specifications found in the literature. For every specification, our approach was able to either produce a correctness proof or to uncover errors in the specification.

RO Mar 27, 2020

RTLola Cleared for Take-Off: Monitoring Autonomous Aircraft

Jan Baumeister, Bernd Finkbeiner, Sebastian Schirmer et al.

The autonomous control of unmanned aircraft is a highly safety-critical domain with great economic potential in a wide range of application areas, including logistics, agriculture, civil engineering, and disaster recovery. We report on the development of a dynamic monitoring framework for the DLR ARTIS (Autonomous Rotorcraft Testbed for Intelligent Systems) family of unmanned aircraft based on the formal specification language RTLola. RTLola is a stream-based specification language for real-time properties. An RTLola specification of hazardous situations and system failures is statically analyzed in terms of consistency and resource usage and then automatically translated into an FPGA-based monitor. Our approach leads to highly efficient, parallelized monitors with formal guarantees on the noninterference of the monitor with the normal operation of the autonomous system.
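To give a feel for stream-based monitoring, the sketch below computes an output stream and a trigger from an input stream in plain Python. It is not RTLola syntax and has no real-time or FPGA aspects; the stream name `altitudes`, the sliding-window average, and the threshold are all hypothetical choices made for illustration.

```python
from collections import deque


def monitor(altitudes, window=3, limit=100.0):
    """Yield (time step, windowed average, trigger) for each input value.

    The average is an output stream derived from the input stream, and the
    trigger fires whenever that derived stream exceeds the limit.
    """
    recent = deque(maxlen=window)  # keeps only the last `window` values
    for t, alt in enumerate(altitudes):
        recent.append(alt)
        avg = sum(recent) / len(recent)
        yield t, avg, avg > limit


if __name__ == "__main__":
    for t, avg, trigger in monitor([90, 100, 110, 120], window=2, limit=100):
        print(t, avg, "TRIGGER" if trigger else "ok")
```

Because the window is bounded, the memory used by this monitor does not grow with the length of the input, which is the kind of resource bound a static analysis can establish ahead of time.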

LO Mar 6, 2020

Teaching Temporal Logics to Neural Networks

Christopher Hahn, Frederik Schmitt, Jens U. Kreber et al.

We study two fundamental questions in neuro-symbolic computing: can deep learning tackle challenging problems in logics end-to-end, and can neural networks learn the semantics of logics? In this work, we focus on linear-time temporal logic (LTL), as it is widely used in verification. We train a Transformer to directly predict a solution, i.e., a trace, to a given LTL formula. The training data is generated with classical solvers, which, however, only provide one of many possible solutions to each formula. We demonstrate that it is sufficient to train on those particular solutions to formulas, and that Transformers can predict solutions even to formulas from benchmarks from the literature on which the classical solver timed out. Transformers also generalize to the semantics of the logics: while they often deviate from the solutions found by the classical solvers, they still predict correct solutions to most formulas.
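Since a formula can have many satisfying traces, a predicted trace has to be judged by whether it satisfies the formula, not by whether it matches the solver's output. The sketch below shows one way to check this for lasso-shaped traces (a finite prefix followed by a loop repeated forever). The tuple encoding of formulas and the function name `holds` are assumptions made for this example and are not taken from the paper.

```python
def holds(formula, prefix, loop):
    """Check whether the trace prefix . loop^omega satisfies an LTL formula.

    Each step of the trace is a set of atomic propositions that are true.
    Formulas are nested tuples, e.g. ("U", ("ap", "a"), ("ap", "b")).
    """
    if not loop:
        raise ValueError("loop must be non-empty")
    trace = list(prefix) + list(loop)
    n = len(trace)
    # Successor of each position; the last position jumps back to the loop start.
    succ = [i + 1 for i in range(n)]
    succ[-1] = len(prefix)

    def sat(f):
        # Returns, for every position, whether f holds from that position on.
        op = f[0]
        if op == "true":
            return [True] * n
        if op == "ap":
            return [f[1] in step for step in trace]
        if op == "not":
            return [not v for v in sat(f[1])]
        if op == "and":
            return [x and y for x, y in zip(sat(f[1]), sat(f[2]))]
        if op == "or":
            return [x or y for x, y in zip(sat(f[1]), sat(f[2]))]
        if op == "X":
            a = sat(f[1])
            return [a[succ[i]] for i in range(n)]
        if op == "U":
            # Least fixpoint: start from positions where the right side holds,
            # then add positions where the left side holds and the successor
            # is already in the set.
            a, b = sat(f[1]), sat(f[2])
            res = list(b)
            changed = True
            while changed:
                changed = False
                for i in range(n):
                    if not res[i] and a[i] and res[succ[i]]:
                        res[i] = True
                        changed = True
            return res
        if op == "F":  # F p  ==  true U p
            return sat(("U", ("true",), f[1]))
        if op == "G":  # G p  ==  not F not p
            return [not v for v in sat(("F", ("not", f[1])))]
        raise ValueError(f"unknown operator: {op!r}")

    return sat(formula)[0]
```

For instance, `holds(("U", ("ap", "a"), ("ap", "b")), [{"a"}, {"a"}], [{"b"}])` checks the formula a U b on a trace where a holds twice and b holds forever after.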

LO Jan 1, 2019

Proceedings 3rd Workshop on formal reasoning about Causation, Responsibility, and Explanations in Science and Technology

Bernd Finkbeiner, Samantha Kleinberg

The CREST 2018 workshop is the third in a series of workshops addressing formal approaches to reasoning about causation in systems engineering. The topic of formally identifying the cause(s) of specific events (usually some form of failure) and explaining why they occurred is increasingly in the focus of several disjoint communities. The main objective of CREST is to bring together researchers and practitioners from industry and academia in order to enable discussions of how explicit and implicit reasoning about causation is performed. A further objective is to link to the foundations of causal reasoning in the philosophy of science and to causal reasoning performed in other areas of computer science, engineering, and beyond.

SE Mar 29, 2018

Stream Runtime Monitoring on UAS

Florian-Michael Adolf, Peter Faymonville, Bernd Finkbeiner et al.

Unmanned Aircraft Systems (UAS) with autonomous decision-making capabilities are of increasing interest for a wide range of applications such as logistics and disaster recovery. In order to ensure the correct behavior of the system and to recognize hazardous situations or system faults, we applied stream runtime monitoring techniques within the DLR ARTIS (Autonomous Rotorcraft Testbed for Intelligent Systems) family of unmanned aircraft. We present our experience from specification elicitation, instrumentation, offline log-file analysis, and online monitoring on the flight computer on a test rig. The debugging and health management support provided by stream runtime monitoring techniques has proven highly beneficial for system design and development. At the same time, the project has identified usability improvements to the specification language and has influenced the design of the language.

LO Mar 27, 2018

Facets of Software Doping

Gilles Barthe, Pedro R. D'Argenio, Bernd Finkbeiner et al.

This paper provides an informal discussion of the formal aspects of software doping.

FL Mar 25, 2018

Synthesizing Skeletons for Reactive Systems

Bernd Finkbeiner, Hazem Torfah

We present an analysis technique for temporal specifications of reactive systems that identifies, on the level of individual system outputs over time, which parts of the implementation are determined by the specification, and which parts are still open. This information is represented in the form of a labeled transition system, which we call a skeleton. Each state of the skeleton is labeled with a three-valued assignment to the output variables: each output can be true, false, or open, where true or false means that the value must be true or false, respectively, and open means that either value is still possible. We present algorithms for the verification of skeletons and for the learning-based synthesis of skeletons from specifications in linear-time temporal logic (LTL). The synthesis algorithm returns a skeleton that satisfies the given LTL specification in time polynomial in the size of the minimal skeleton. Our new analysis technique can be used to recognize and repair specifications that underspecify critical situations. The technique thus complements existing methods for the recognition and repair of overspecifications via the identification of unrealizable cores.
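The three-valued labels can be pictured with a small sketch. It only illustrates the labeling idea and is not the verification or synthesis algorithm from the paper; the `OPEN` marker, the function names, and the output names `grant` and `ack` are hypothetical.

```python
OPEN = None  # hypothetical marker meaning "either value is still possible"


def merge(assignments):
    """Combine concrete output assignments into one three-valued label.

    An output keeps its value if all assignments agree on it and becomes
    OPEN as soon as two assignments disagree.
    """
    label = {}
    for out in assignments[0]:
        values = {a[out] for a in assignments}
        label[out] = values.pop() if len(values) == 1 else OPEN
    return label


def consistent(label, assignment):
    # A concrete assignment fits a label if it matches every determined output.
    return all(v is OPEN or assignment[o] == v for o, v in label.items())
```

For example, merging `{"grant": True, "ack": False}` with `{"grant": True, "ack": True}` yields a label in which `grant` is determined to be true and `ack` is open.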

LO Aug 29, 2017

Verifying Security Policies in Multi-agent Workflows with Loops

Bernd Finkbeiner, Christian Müller, Helmut Seidl et al.

We consider the automatic verification of information flow security policies of web-based workflows, such as conference submission systems like EasyChair. Our workflow description language allows for loops, non-deterministic choice, and an unbounded number of participating agents. The information flow policies are specified in a temporal logic for hyperproperties. We show that the verification problem can be reduced to the satisfiability of a formula of first-order linear-time temporal logic, and provide decidability results for relevant classes of workflows and specifications. We report on experimental results obtained with an implementation of our approach on a series of benchmarks.

LO Oct 14, 2016

The First-Order Logic of Hyperproperties

Bernd Finkbeiner, Martin Zimmermann

We investigate the logical foundations of hyperproperties. Hyperproperties generalize trace properties, which are sets of traces, to sets of sets of traces. The most prominent application of hyperproperties is information flow security: information flow policies characterize the secrecy and integrity of a system by comparing two or more execution traces, for example by comparing the observations made by an external observer on execution traces that result from different values of a secret variable. In this paper, we establish the first connection between temporal logics for hyperproperties and first-order logic. Kamp's seminal theorem (in the formulation due to Gabbay et al.) states that linear-time temporal logic (LTL) is expressively equivalent to first-order logic over the natural numbers with order. We introduce first-order logic over sets of traces and prove that HyperLTL, the extension of LTL to hyperproperties, is strictly subsumed by this logic. We furthermore exhibit a fragment that is expressively equivalent to HyperLTL, thereby establishing Kamp's theorem for hyperproperties.
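The difference between a trace property and a hyperproperty can be seen in a small check over a finite set of finite traces. The sketch below tests a simple information flow condition in the style of observational determinism: any two traces that agree on the low-security inputs must also agree on the low-security outputs. The trace encoding and the keys `low_in`, `low_out`, and `secret` are assumptions made for this example.

```python
from itertools import combinations


def low_view(trace, key):
    # Project a trace (a list of dicts) onto one observable variable.
    return [step[key] for step in trace]


def observational_determinism(traces, low_in="low_in", low_out="low_out"):
    """Return True if no pair of traces leaks through the low outputs.

    The condition compares pairs of traces, so it cannot be decided by
    inspecting each trace in isolation.
    """
    for t1, t2 in combinations(traces, 2):
        same_inputs = low_view(t1, low_in) == low_view(t2, low_in)
        same_outputs = low_view(t1, low_out) == low_view(t2, low_out)
        if same_inputs and not same_outputs:
            return False
    return True
```

A set containing a single trace always passes this check, while a set of two traces that differ only in a secret value and in the low output fails it, which shows that the violation is a property of the set and not of any individual trace.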

LO Jun 27, 2013

A Temporal Logic for Hyperproperties

Bernd Finkbeiner, Markus N. Rabe, César Sánchez

Hyperproperties, as introduced by Clarkson and Schneider, characterize the correctness of a computer program as a condition on its set of computation paths. Standard temporal logics can only refer to a single path at a time, and therefore cannot express many hyperproperties of interest, including noninterference and other important properties in security and coding theory. In this paper, we investigate an extension of temporal logic with explicit path variables. We show that the quantification over paths naturally subsumes other extensions of temporal logic with operators for information flow and knowledge. The model checking problem for temporal logic with path quantification is decidable. For alternation depth 1, the complexity is PSPACE in the length of the formula and NLOGSPACE in the size of the system, as for linear-time temporal logic.