50.3SYMay 11
Convex Computations for Controlled Safety Invariant Sets of Black-box Discrete-time Dynamical SystemsTaoran Wu, Yiling Xue, Jingduo Pan et al.
Identifying controlled safety invariant sets (CSISs) is essential for safety-critical systems. This paper addresses the problem of computing CSISs for black-box discrete-time systems, where the dynamics are unknown and only limited simulation data are available. Traditionally, a CSIS requires that for every state in the set, there exists a control input that keeps the system within the set at the next step. However, enforcing such universal invariance, i.e., requiring the set to remain controlled invariant for all states, is often overly restrictive or impractical for black-box systems. To address this, we introduce the notion of a Probably Approximately Correct (PAC) CSIS, in which, with prescribed confidence, there exists a suitable control input to keep the system within the set at the next step for at least a specified fraction of the states. Our approach leverages barrier functions and scenario optimization, yielding a tractable linear programming method for estimating PAC CSISs. Several illustrative examples demonstrate the effectiveness of the proposed framework.
82.5SYMay 23
Cost-Aware Adaptive Conformal Inference for Runtime Assurance in Dynamic EnvironmentsTaoran Wu, Jingduo Pan, Luke Ong et al.
This paper addresses the problem of providing runtime assurance for systems operating online under unknown and potentially time-varying data distributions. We propose Cost-Aware Adaptive Conformal Inference (ACI), a novel framework that incorporates constraint violation costs directly into the conformal adaptation mechanism. Our key insight is that uncertainty margins should adapt not only to the frequency of constraint violations but also to their severity. We formalize this through a cost-aware loss function that couples the miscoverage indicator with violation costs. Unlike existing methods that regulate a single controlled metric, our approach provides a dual statistical guarantee: simultaneously bounding the long-run average violation frequencies (reliability) and cumulative violation cost (harm). By weighting prediction failures according to their severity, the algorithm enables the controller to respond proportionally to violation severity, expanding prediction sets aggressively when necessary while maintaining efficiency during nominal operation. We integrate Cost-Aware ACI into a robust control synthesis framework, creating a closed-loop system that balances task performance with runtime risk control without requiring explicit model knowledge. Experiments validate its effectiveness for online risk-aware controller synthesis.
49.8LGMay 12
Stochastic Minimum-Cost Reach-Avoid Reinforcement LearningJingduo Pan, Taoran Wu, Yiling Xue et al.
We study stochastic minimum-cost reach-avoid reinforcement learning, where an agent must satisfy a reach-avoid specification with probability at least $p$ while minimizing expected cumulative costs in stochastic environments. Existing safe and constrained reinforcement learning methods typically fail to jointly enforce probabilistic reach-avoid constraints and optimize cost in the learning setting in stochastic environments. To address this challenge, we introduce reach-avoid probability certificates (RAPCs), which identify states from which stochastic reach-avoid constraints are satisfiable. Building on RAPCs, we develop a contraction-based Bellman formulation that serves as a principled surrogate for integrating reach-avoid considerations into reinforcement learning, enabling cost optimization under probabilistic constraints. We establish almost sure convergence of the proposed algorithms to locally optimal policies with respect to the resulting objective. Experiments in the MuJoCo simulator demonstrate improved cost performance and consistently higher reach-avoid satisfaction rates.