GTOct 17, 2023
Partially Observable Stochastic Games with Neural Perception MechanismsRui Yan, Gabriel Santos, Gethin Norman et al.
Stochastic games are a well established model for multi-agent sequential decision making under uncertainty. In practical applications, though, agents often have only partial observability of their environment. Furthermore, agents increasingly perceive their environment using data-driven approaches such as neural networks trained on continuous data. We propose the model of neuro-symbolic partially-observable stochastic games (NS-POSGs), a variant of continuous-space concurrent stochastic games that explicitly incorporates neural perception mechanisms. We focus on a one-sided setting with a partially-informed agent using discrete, data-driven observations and another, fully-informed agent. We present a new method, called one-sided NS-HSVI, for approximate solution of one-sided NS-POSGs, which exploits the piecewise constant structure of the model. Using neural network pre-image analysis to construct finite polyhedral representations and particle-based representations for beliefs, we implement our approach and illustrate its practical applicability to the analysis of pedestrian-vehicle and pursuit-evasion scenarios.
SYJun 30, 2023
Point-Based Value Iteration for POMDPs with Neural Perception MechanismsRui Yan, Gabriel Santos, Gethin Norman et al.
The increasing trend to integrate neural networks and conventional software components in safety-critical settings calls for methodologies for their formal modelling, verification and correct-by-construction policy synthesis. We introduce neuro-symbolic partially observable Markov decision processes (NS-POMDPs), a variant of continuous-state POMDPs with discrete observations and actions, in which the agent perceives a continuous-state environment using a neural {\revise perception mechanism} and makes decisions symbolically. The perception mechanism classifies inputs such as images and sensor values into symbolic percepts, which are used in decision making. We study the problem of optimising discounted cumulative rewards for NS-POMDPs. Working directly with the continuous state space, we exploit the underlying structure of the model and the neural perception mechanism to propose a novel piecewise linear and convex representation (P-PWLC) in terms of polyhedra covering the state space and value vectors, and extend Bellman backups to this representation. We prove the convexity and continuity of value functions and present two value iteration algorithms that ensure finite representability. The first is a classical (exact) value iteration algorithm extending the $α$-functions of Porta {\em et al} (2006) to the P-PWLC representation for continuous-state spaces. The second is a point-based (approximate) method called NS-HSVI, which uses the P-PWLC representation and belief-value induced functions to approximate value functions from below and above for two types of beliefs, particle-based and region-based. Using a prototype implementation, we show the practical applicability of our approach on two case studies that employ (trained) ReLU neural networks as perception functions, by synthesising (approximately) optimal strategies.
GTApr 16, 2024
HSVI-based Online Minimax Strategies for Partially Observable Stochastic Games with Neural Perception MechanismsRui Yan, Gabriel Santos, Gethin Norman et al.
We consider a variant of continuous-state partially-observable stochastic games with neural perception mechanisms and an asymmetric information structure. One agent has partial information, with the observation function implemented as a neural network, while the other agent is assumed to have full knowledge of the state. We present, for the first time, an efficient online method to compute an $\varepsilon$-minimax strategy profile, which requires only one linear program to be solved for each agent at every stage, instead of a complex estimation of opponent counterfactual values. For the partially-informed agent, we propose a continual resolving approach which uses lower bounds, pre-computed offline with heuristic search value iteration (HSVI), instead of opponent counterfactual values. This inherits the soundness of continual resolving at the cost of pre-computing the bound. For the fully-informed agent, we propose an inferred-belief strategy, where the agent maintains an inferred belief about the belief of the partially-informed agent based on (offline) upper bounds from HSVI, guaranteeing $\varepsilon$-distance to the value of the game at the initial belief known to both agents.
GTNov 8, 2024
Expectation vs. Reality: Towards Verification of Psychological GamesMarta Kwiatkowska, Gethin Norman, David Parker et al.
Game theory provides an effective way to model strategic interactions among rational agents. In the context of formal verification, these ideas can be used to produce guarantees on the correctness of multi-agent systems, with a diverse range of applications from computer security to autonomous driving. Psychological games (PGs) were developed as a way to model and analyse agents with belief-dependent motivations, opening up the possibility to model how human emotions can influence behaviour. In PGs, players' utilities depend not only on what actually happens (which strategies players choose to adopt), but also on what the players had expected to happen (their belief as to the strategies that would be played). Despite receiving much attention in fields such as economics and psychology, very little consideration has been given to their applicability to problems in computer science, nor to practical algorithms and tool support. In this paper, we start to bridge that gap, proposing methods to solve PGs and implementing them within PRISM-games, a formal verification tool for stochastic games. We discuss how to model these games, highlight specific challenges for their analysis and illustrate the usefulness of our approach on several case studies, including human behaviour in traffic scenarios.
AIFeb 13, 2022
Strategy Synthesis for Zero-Sum Neuro-Symbolic Concurrent Stochastic GamesRui Yan, Gabriel Santos, Gethin Norman et al.
Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise two probabilistic finite-state agents interacting in a shared continuous-state environment. Each agent observes the environment using a neural perception mechanism, which converts inputs such as images into symbolic percepts, and makes decisions symbolically. We focus on the class of NS-CSGs with Borel state spaces and prove the existence and measurability of the value function for zero-sum discounted cumulative rewards under piecewise-constant restrictions on the components of this class of models. To compute values and synthesise strategies, we present, for the first time, practical value iteration (VI) and policy iteration (PI) algorithms to solve this new subclass of continuous-state CSGs. These require a finite decomposition of the environment induced by the neural perception mechanisms of the agents and rely on finite abstract representations of value functions and strategies closed under VI or PI. First, we introduce a Borel measurable piecewise-constant (B-PWC) representation of value functions, extend minimax backups to this representation and propose a value iteration algorithm called B-PWC VI. Second, we introduce two novel representations for the value functions and strategies, constant-piecewise-linear (CON-PWL) and constant-piecewise-constant (CON-PWC) respectively, and propose Minimax-action-free PI by extending a recent PI method based on alternating player choices for finite state spaces to Borel state spaces, which does not require normal-form games to be solved.
ROSep 1, 2016
A Continuous-Time Model of an Autonomous Aerial Vehicle to Inform and Validate Formal Verification MethodsMurray L. Ireland, Ruth Hoffmann, Alice Miller et al.
If autonomous vehicles are to be widely accepted, we need to ensure their safe operation. For this reason, verification and validation (V&V) approaches must be developed that are suitable for this domain. Model checking is a formal technique which allows us to exhaustively explore the paths of an abstract model of a system. Using a probabilistic model checker such as PRISM, we may determine properties such as the expected time for a mission, or the probability that a specific mission failure occurs. However, model checking of complex systems is difficult due to the loss of information during abstraction. This is especially so when considering systems such as autonomous vehicles which are subject to external influences. An alternative solution is the use of Monte Carlo simulation to explore the results of a continuous-time model of the system. The main disadvantage of this approach is that the approach is not exhaustive as not all executions of the system are analysed. We are therefore interested in developing a framework for formal verification of autonomous vehicles, using Monte Carlo simulation to inform and validate our symbolic models during the initial stages of development. In this paper, we present a continuous-time model of a quadrotor unmanned aircraft undertaking an autonomous mission. We employ this model in Monte Carlo simulation to obtain specific mission properties which will inform the symbolic models employed in formal verification.
SYFeb 1, 2016
Autonomous Agent Behaviour Modelled in PRISM -- A Case StudyRuth Hoffmann, Murray Ireland, Alice Miller et al.
Formal verification of agents representing robot behaviour is a growing area due to the demand that autonomous systems have to be proven safe. In this paper we present an abstract definition of autonomy which can be used to model autonomous scenarios and propose the use of small-scale simulation models representing abstract actions to infer quantitative data. To demonstrate the applicability of the approach we build and verify a model of an unmanned aerial vehicle (UAV) in an exemplary autonomous scenario, utilising this approach.
LOJun 23, 2015
Verification and Control of Partially Observable Probabilistic Real-Time SystemsGethin Norman, David Parker, Xueyi Zou
We propose automated techniques for the verification and control of probabilistic real-time systems that are only partially observable. To formally model such systems, we define an extension of probabilistic timed automata in which local states are partially visible to an observer or controller. We give a probabilistic temporal logic that can express a range of quantitative properties of these models, relating to the probability of an event's occurrence or the expected value of a reward measure. We then propose techniques to either verify that such a property holds or to synthesise a controller for the model which makes it true. Our approach is based on an integer discretisation of the model's dense-time behaviour and a grid-based abstraction of the uncountable belief space induced by partial observability. The latter is necessarily approximate since the underlying problem is undecidable, however we show how both lower and upper bounds on numerical results can be generated. We illustrate the effectiveness of the approach by implementing it in the PRISM model checker and applying it to several case studies, from the domains of computer security and task scheduling.