AIMay 28, 2025
Efficient Dynamic Shielding for Parametric Safety SpecificationsDavide Corsi, Kaushik Mallik, Andoni Rodriguez et al.
Shielding has emerged as a promising approach for ensuring safety of AI-controlled autonomous systems. The algorithmic goal is to compute a shield, which is a runtime safety enforcement tool that needs to monitor and intervene the AI controller's actions if safety could be compromised otherwise. Traditional shields are designed statically for a specific safety requirement. Therefore, if the safety requirement changes at runtime due to changing operating conditions, the shield needs to be recomputed from scratch, causing delays that could be fatal. We introduce dynamic shields for parametric safety specifications, which are succinctly represented sets of all possible safety specifications that may be encountered at runtime. Our dynamic shields are statically designed for a given safety parameter set, and are able to dynamically adapt as the true safety specification (permissible by the parameters) is revealed at runtime. The main algorithmic novelty lies in the dynamic adaptation procedure, which is a simple and fast algorithm that utilizes known features of standard safety shields, like maximal permissiveness. We report experimental results for a robot navigation problem in unknown territories, where the safety specification evolves as new obstacles are discovered at runtime. In our experiments, the dynamic shields took a few minutes for their offline design, and took between a fraction of a second and a few seconds for online adaptation at each step, whereas the brute-force online recomputation approach was up to 5 times slower.
LGJun 10, 2024
Verification-Guided Shielding for Deep Reinforcement LearningDavide Corsi, Guy Amir, Andoni Rodriguez et al.
In recent years, Deep Reinforcement Learning (DRL) has emerged as an effective approach to solving real-world tasks. However, despite their successes, DRL-based policies suffer from poor reliability, which limits their deployment in safety-critical domains. Various methods have been put forth to address this issue by providing formal safety guarantees. Two main approaches include shielding and verification. While shielding ensures the safe behavior of the policy by employing an external online component (i.e., a ``shield'') that overrides potentially dangerous actions, this approach has a significant computational cost as the shield must be invoked at runtime to validate every decision. On the other hand, verification is an offline process that can identify policies that are unsafe, prior to their deployment, yet, without providing alternative actions when such a policy is deemed unsafe. In this work, we present verification-guided shielding -- a novel approach that bridges the DRL reliability gap by integrating these two methods. Our approach combines both formal and probabilistic verification tools to partition the input domain into safe and unsafe regions. In addition, we employ clustering and symbolic representation procedures that compress the unsafe regions into a compact representation. This, in turn, allows to temporarily activate the shield solely in (potentially) unsafe regions, in an efficient manner. Our novel approach allows to significantly reduce runtime overhead while still preserving formal safety guarantees. We extensively evaluate our approach on two benchmarks from the robotic navigation domain, as well as provide an in-depth analysis of its scalability and completeness.
LOJun 6, 2024
Shield Synthesis for LTL Modulo TheoriesAndoni Rodriguez, Guy Amir, Davide Corsi et al.
In recent years, Machine Learning (ML) models have achieved remarkable success in various domains. However, these models also tend to demonstrate unsafe behaviors, precluding their deployment in safety-critical systems. To cope with this issue, ample research focuses on developing methods that guarantee the safe behaviour of a given ML model. A prominent example is shielding which incorporates an external component (a ``shield'') that blocks unwanted behavior. Despite significant progress, shielding suffers from a main setback: it is currently geared towards properties encoded solely in propositional logics (e.g., LTL) and is unsuitable for richer logics. This, in turn, limits the widespread applicability of shielding in many real-world systems. In this work, we address this gap, and extend shielding to LTL modulo theories, by building upon recent advances in reactive synthesis modulo theories. This allowed us to develop a novel approach for generating shields conforming to complex safety specifications in these more expressive, logics. We evaluated our shields and demonstrate their ability to handle rich data with temporal dynamics. To the best of our knowledge, this is the first approach for synthesizing shields for such expressivity.
FLSep 18, 2020
Bounded Model Checking for HyperpropertiesTzu-Han Hsu, Cesar Sanchez, Borzoo Bonakdarpour
Hyperproperties are properties of systems that relate multiple computation traces, including security and concurrency properties. This paper introduces a bounded model checking (BMC) algorithm for hyperproperties expressed in HyperLTL, which - to the best of our knowledge - is the first such algorithm. Just as the classic BMC technique for LTL primarily aims at finding bugs, our approach also targets identifying counterexamples. BMC for LTL is reduced to SAT solving, because LTL describes a property via inspecting individual traces. HyperLTL allows explicit and simultaneous quantification over traces and describes properties that involves multiple traces and, hence, our BMC approach naturally reduces to QBF solving. We report on successful and efficient model checking, implemented in a tool called HyperQube, of a rich set of experiments on a variety of case studies, including security, concurrent data structures, path planning for robots, and testing.
SEFeb 28, 2020
Declarative Stream Runtime Verification (hLola)Martin Ceresa, Felipe Gorostiaga, Cesar Sanchez
Stream Runtime Verification is a formal dynamic analysis technique that generalizes runtime verification algorithms from temporal logics like LTL to stream monitoring, allowing to compute richer verdicts than Booleans (including quantitative and arbitrary data). In this paper we study the problem of implementing an SRV engine that is truly extensible to arbitrary data theories, and we propose a solution as a Haskell embedded domain specific language. In spite of the theoretical clean separation in SRV between temporal dependencies and data computations, previous engines include ad-hoc implementations of a few data types, requiring complex changes to incorporate new data theories. We propose here an SRV language called hLola that borrows general Haskell types and embeds them transparently into an eDSL. This novel technique, which we call lift deep embedding, allows for example, the use of higher-order functions for static stream parameterization. We describe the Haskell implementation of hLola and illustrate simple extensions implemented using libraries, which require long and error-prone additions in other ad-hoc SRV formalisms.
DCFeb 28, 2018
i2kit: A Tool for Immutable Infrastructure Deployments based on Lightweight Virtual Machines specialized to run ContainersPablo Chico de Guzman, Felipe Gorostiaga, Cesar Sanchez
Container technologies, like Docker, are becoming increasingly popular. Containers provide exceptional developer experience because containers offer lightweight isolation and ease of software distribution. Containers are also widely used in production environments, where a different set of challenges arise such as security, networking, service discovery and load balancing. Container cluster management tools, such as Kubernetes, attempt to solve these problems by introducing a new control layer with the container as the unit of deployment. However, adding a new control layer is an extra configuration step and an additional potential source of runtime errors. The virtual machine technology offered by cloud providers is more mature and proven in terms of security, networking, service discovery and load balancing. However, virtual machines are heavier than containers for local development, are less flexible for resource allocation, and suffer longer boot times. This paper presents an alternative to containers that enjoy the best features of both approaches: (1) the use of mature, proven cloud vendor technology; (2) no need for a new control layer; and (3) as lightweight as containers. Our solution is i2kit, a deployment tool based on the immutable infrastructure pattern, where the virtual machine is the unit of deployment. The i2kit tool accepts a simplified format of Kubernetes Deployment Manifests in order to reuse Kubernetes' most successful principles, but it creates a lightweight virtual machine for each Pod using Linuxkit. Linuxkit alleviates the drawback in size that using virtual machines would otherwise entail, because the footprint of Linuxkit is approximately 60MB. Finally, the attack surface of the system is reduced since Linuxkit only installs the minimum set of OS dependencies to run containers, and different Pods are isolated by hypervisor technology.
PLJan 6, 2012
Abstracting Runtime Heaps for Program UnderstandingMark Marron, Cesar Sanchez, Zhendong Su et al.
Modern programming environments provide extensive support for inspecting, analyzing, and testing programs based on the algorithmic structure of a program. Unfortunately, support for inspecting and understanding runtime data structures during execution is typically much more limited. This paper provides a general purpose technique for abstracting and summarizing entire runtime heaps. We describe the abstract heap model and the associated algorithms for transforming a concrete heap dump into the corresponding abstract model as well as algorithms for merging, comparing, and computing changes between abstract models. The abstract model is designed to emphasize high-level concepts about heap-based data structures, such as shape and size, as well as relationships between heap structures, such as sharing and connectivity. We demonstrate the utility and computational tractability of the abstract heap model by building a memory profiler. We then use this tool to check for, pinpoint, and correct sources of memory bloat from a suite of programs from DaCapo.