Florian Lorber

SE
5papers
25citations
Novelty42%
AI Score22

5 Papers

LOAug 28, 2023
Shielded Reinforcement Learning for Hybrid Systems

Asger Horn Brorholt, Peter Gjøl Jensen, Kim Guldstrand Larsen et al.

Safe and optimal controller synthesis for switched-controlled hybrid systems, which combine differential equations and discrete changes of the system's state, is known to be intricately hard. Reinforcement learning has been leveraged to construct near-optimal controllers, but their behavior is not guaranteed to be safe, even when it is encouraged by reward engineering. One way of imposing safety to a learned controller is to use a shield, which is correct by design. However, obtaining a shield for non-linear and hybrid environments is itself intractable. In this paper, we propose the construction of a shield using the so-called barbaric method, where an approximate finite representation of an underlying partition-based two-player safety game is extracted via systematically picked samples of the true transition function. While hard safety guarantees are out of reach, we experimentally demonstrate strong statistical safety guarantees with a prototype implementation and UPPAAL STRATEGO. Furthermore, we study the impact of the synthesized shield when applied as either a pre-shield (applied before learning a controller) or a post-shield (only applied after learning a controller). We experimentally demonstrate superiority of the pre-shielding approach. We apply our technique on a range of case studies, including two industrial examples, and further study post-optimization of the post-shielding approach.

LOJun 30, 2020
It's Time to Play Safe: Shield Synthesis for Timed Systems

Roderick Bloem, Peter Gjøl Jensen, Bettina Könighofer et al.

Erroneous behaviour in safety critical real-time systems may inflict serious consequences. In this paper, we show how to synthesize timed shields from timed safety properties given as timed automata. A timed shield enforces the safety of a running system while interfering with the system as little as possible. We present timed post-shields and timed pre-shields. A timed pre-shield is placed before the system and provides a set of safe outputs. This set restricts the choices of the system. A timed post-shield is implemented after the system. It monitors the system and corrects the system's output only if necessary. We further extend the timed post-shield construction to provide a guarantee on the recovery phase, i.e., the time between a specification violation and the point at which full control can be handed back to the system. In our experimental results, we use timed post-shields to ensure the safety in a reinforcement learning setting for controlling a platoon of cars, during the learning and execution phase, and study the effect.

SEFeb 11, 2019
COST Action IC 1402 ArVI: Runtime Verification Beyond Monitoring -- Activity Report of Working Group 1

Wolfgang Ahrendt, Cyrille Artho, Christian Colombo et al.

This report presents the activities of the first working group of the COST Action ArVI, Runtime Verification beyond Monitoring. The report aims to provide an overview of some of the major core aspects involved in Runtime Verification. Runtime Verification is the field of research dedicated to the analysis of system executions. It is often seen as a discipline that studies how a system run satisfies or violates correctness properties. The report exposes a taxonomy of Runtime Verification (RV) presenting the terminology involved with the main concepts of the field. The report also develops the concept of instrumentation, the various ways to instrument systems, and the fundamental role of instrumentation in designing an RV framework. We also discuss how RV interplays with other verification techniques such as model-checking, deductive verification, model learning, testing, and runtime assertion checking. Finally, we propose challenges in monitoring quantitative and statistical data beyond detecting property violation.

SESep 10, 2018
Effortless Fault Localisation: Conformance Testing of Real-Time Systems in Ecdar

Tobias R. Gundersen, Florian Lorber, Ulrik Nyman et al.

Model checking of real-time systems has evolved throughout the years. Recently, the model checker Ecdar, using timed I/O automata, was used to perform compositional verification. However, in order to fully integrate model checking of real-time systems into industrial development, we need a productive and reliable way to test if such a system conforms to its corresponding model. Hence, we present an extension of Ecdar that integrates conformance testing into a new IDE that now features modelling, verification, and testing. The new tool uses model-based mutation testing, requiring only the model and the system under test, to locate faults and to prove the absence of certain types of faults. It supports testing using either real-time or simulated time. It parallelises test-case generation and test execution to provide a significant speed-up. We also introduce new mutation operators that improve the ability to detect and locate faults. Finally, we conduct a case study with 140 faulty systems, where Ecdar detects all faults.

SEAug 23, 2018
Learning Timed Automata via Genetic Programming

Martin Tappler, Bernhard K. Aichernig, Kim Guldstrand Larsen et al.

Model learning has gained increasing interest in recent years. It derives behavioural models from test data of black-box systems. The main advantage offered by such techniques is that they enable model-based analysis without access to the internals of a system. Applications range from fully automated testing over model checking to system understanding. Current work focuses on learning variations of finite state machines. However, most techniques consider discrete time. In this paper, we present a method for learning timed automata, finite state machines extended with real-valued clocks. The learning method generates a model consistent with a set of timed traces collected by testing. This generation is based on genetic programming, a search-based technique for automatic program creation. We evaluate our approach on 44 timed systems, comprising four systems from the literature and 40 randomly generated examples.