LGMar 22, 2023
Strategy Synthesis in Markov Decision Processes Under Limited Sampling AccessChristel Baier, Clemens Dubslaff, Patrick Wienhöft et al.
A central task in control theory, artificial intelligence, and formal methods is to synthesize reward-maximizing strategies for agents that operate in partially unknown environments. In environments modeled by gray-box Markov decision processes (MDPs), the impact of the agents' actions are known in terms of successor states but not the stochastics involved. In this paper, we devise a strategy synthesis algorithm for gray-box MDPs via reinforcement learning that utilizes interval MDPs as internal model. To compete with limited sampling access in reinforcement learning, we incorporate two novel concepts into our algorithm, focusing on rapid and successful learning rather than on stochastic guarantees and optimality: lower confidence bound exploration reinforces variants of already learned practical strategies and action scoping reduces the learning action space to promising actions. We illustrate benefits of our algorithms by means of a prototypical implementation applied on examples from the AI and formal methods communities.
AIJan 20, 2023
On the Foundations of Cycles in Bayesian NetworksChristel Baier, Clemens Dubslaff, Holger Hermanns et al.
Bayesian networks (BNs) are a probabilistic graphical model widely used for representing expert knowledge and reasoning under uncertainty. Traditionally, they are based on directed acyclic graphs that capture dependencies between random variables. However, directed cycles can naturally arise when cross-dependencies between random variables exist, e.g., for modeling feedback loops. Existing methods to deal with such cross-dependencies usually rely on reductions to BNs without cycles. These approaches are fragile to generalize, since their justifications are intermingled with additional knowledge about the application context. In this paper, we present a foundational study regarding semantics for cyclic BNs that are generic and conservatively extend the cycle-free setting. First, we propose constraint-based semantics that specify requirements for full joint distributions over a BN to be consistent with the local conditional probabilities and independencies. Second, two kinds of limit semantics that formalize infinite unfolding approaches are introduced and shown to be computable by a Markov chain construction.
AINov 13, 2025
Temporal Properties of Conditional Independence in Dynamic Bayesian NetworksRajab Aghamov, Christel Baier, Joel Ouaknine et al.
Dynamic Bayesian networks (DBNs) are compact graphical representations used to model probabilistic systems where interdependent random variables and their distributions evolve over time. In this paper, we study the verification of the evolution of conditional-independence (CI) propositions against temporal logic specifications. To this end, we consider two specification formalisms over CI propositions: linear temporal logic (LTL), and non-deterministic Büchi automata (NBAs). This problem has two variants. Stochastic CI properties take the given concrete probability distributions into account, while structural CI properties are viewed purely in terms of the graphical structure of the DBN. We show that deciding if a stochastic CI proposition eventually holds is at least as hard as the Skolem problem for linear recurrence sequences, a long-standing open problem in number theory. On the other hand, we show that verifying the evolution of structural CI propositions against LTL and NBA specifications is in PSPACE, and is NP- and coNP-hard. We also identify natural restrictions on the graphical structure of DBNs that make the verification of structural CI properties tractable.
LGMay 13, 2023
More for Less: Safe Policy Improvement With Stronger Performance GuaranteesPatrick Wienhöft, Marnix Suilen, Thiago D. Simão et al.
In an offline reinforcement learning setting, the safe policy improvement (SPI) problem aims to improve the performance of a behavior policy according to which sample data has been generated. State-of-the-art approaches to SPI require a high number of samples to provide practical probabilistic guarantees on the improved policy's performance. We present a novel approach to the SPI problem that provides the means to require less data for such guarantees. Specifically, to prove the correctness of these guarantees, we devise implicit transformations on the data set and the underlying environment model that serve as theoretical foundations to derive tighter improvement bounds for SPI. Our empirical evaluation, using the well-established SPI with baseline bootstrapping (SPIBB) algorithm, on standard benchmarks shows that our method indeed significantly reduces the sample complexity of the SPIBB algorithm.
SEJan 18, 2022
Causality in Configurable Software SystemsClemens Dubslaff, Kallistos Weis, Christel Baier et al.
Detecting and understanding reasons for defects and inadvertent behavior in software is challenging due to their increasing complexity. In configurable software systems, the combinatorics that arises from the multitude of features a user might select from adds a further layer of complexity. We introduce the notion of feature causality, which is based on counterfactual reasoning and inspired by the seminal definition of actual causality by Halpern and Pearl. Feature causality operates at the level of system configurations and is capable of identifying features and their interactions that are the reason for emerging functional and non-functional properties. We present various methods to explicate these reasons, in particular well-established notions of responsibility and blame that we extend to the feature-oriented setting. Establishing a close connection of feature causality to prime implicants, we provide algorithms to effectively compute feature causes and causal explications. By means of an evaluation on a wide range of configurable software systems, including community benchmarks and real-world systems, we demonstrate the feasibility of our approach: We illustrate how our notion of causality facilitates to identify root causes, estimate the effects of features, and detect feature interactions.
SEApr 20, 2019
Magnifier: A Compositional Analysis Approach for Autonomous Traffic ControlMaryam Bagheri, Marjan Sirjani, Ehsan Khamespanah et al.
Autonomous traffic control systems are large-scale systems with critical goals. Due to the dynamic nature of the surrounding world of these systems, assuring the satisfaction of their properties at runtime and in the presence of a change is important. A prominent approach to assure the correct behavior of these systems is verification at runtime, which has strict time and memory limitations. To tackle these limitations, we propose Magnifier, an iterative, incremental, and compositional verification approach that operates on a component-based model. The Magnifier idea is zooming on the component affected by a change, verifying the correctness of properties of interest of the system after adapting the component to the change, and then zooming out and tracing the change if it propagates. If the change propagates, all components affected by the change are adapted and are composed to form a new component. Magnifier repeats the same process for the new component. This iterative process terminates whenever the propagation of the change stops. In Magnifier, we use the Coordinated Adaptive Actor model (CoodAA) of traffic control systems. We present a formal semantics for CoodAA as a network of Timed Input-Output Automata (TIOAs). The change does not propagate if TIOAs of the adapted component and its environment are compatible. We implement our approach in Ptolemy II. The results of our experiments indicate that the proposed approach improves the verification time and the memory consumption compared to a non-compositional approach.
SYJul 11, 2017
Synthesis of Optimal Resilient Control StrategiesChristel Baier, Clemens Dubslaff, Ľuboš Korenčiak et al.
Repair mechanisms are important within resilient systems to maintain the system in an operational state after an error occurred. Usually, constraints on the repair mechanisms are imposed, e.g., concerning the time or resources required (such as energy consumption or other kinds of costs). For systems modeled by Markov decision processes (MDPs), we introduce the concept of resilient schedulers, which represent control strategies guaranteeing that these constraints are always met within some given probability. Assigning rewards to the operational states of the system, we then aim towards resilient schedulers which maximize the long-run average reward, i.e., the expected mean payoff. We present a pseudo-polynomial algorithm that decides whether a resilient scheduler exists and if so, yields an optimal resilient scheduler. We show also that already the decision problem asking whether there exists a resilient scheduler is PSPACE-hard.
SEDec 30, 2013
Probabilistic Model Checking for Energy Analysis in Software Product LinesClemens Dubslaff, Sascha Klüppelholz, Christel Baier
In a software product line (SPL), a collection of software products is defined by their commonalities in terms of features rather than explicitly specifying all products one-by-one. Several verification techniques were adapted to establish temporal properties of SPLs. Symbolic and family-based model checking have been proven to be successful for tackling the combinatorial blow-up arising when reasoning about several feature combinations. However, most formal verification approaches for SPLs presented in the literature focus on the static SPLs, where the features of a product are fixed and cannot be changed during runtime. This is in contrast to dynamic SPLs, allowing to adapt feature combinations of a product dynamically after deployment. The main contribution of the paper is a compositional modeling framework for dynamic SPLs, which supports probabilistic and nondeterministic choices and allows for quantitative analysis. We specify the feature changes during runtime within an automata-based coordination component, enabling to reason over strategies how to trigger dynamic feature changes for optimizing various quantitative objectives, e.g., energy or monetary costs and reliability. For our framework there is a natural and conceptually simple translation into the input language of the prominent probabilistic model checker PRISM. This facilitates the application of PRISM's powerful symbolic engine to the operational behavior of dynamic SPLs and their family-based analysis against various quantitative queries. We demonstrate feasibility of our approach by a case study issuing an energy-aware bonding network device.