AI GT LO Feb 13, 2022

Strategy Synthesis for Zero-Sum Neuro-Symbolic Concurrent Stochastic Games

Rui Yan, Gabriel Santos, Gethin Norman, David Parker, Marta Kwiatkowska

arXiv:2202.06255v79.011 citations

Originality: Incremental advance

AI Analysis

This work addresses formal verification for neuro-symbolic systems, which is crucial for safety-critical applications. The advance is incremental, however, because it extends existing methods to a new model class.

The authors tackled the problem of ensuring correctness in neuro-symbolic AI by proposing neuro-symbolic concurrent stochastic games (NS-CSGs) with Borel state spaces, proving the existence of value functions and developing practical value and policy iteration algorithms for strategy synthesis.

Neuro-symbolic approaches to artificial intelligence, which combine neural networks with classical symbolic techniques, are growing in prominence, necessitating formal approaches to reason about their correctness. We propose a novel modelling formalism called neuro-symbolic concurrent stochastic games (NS-CSGs), which comprise two probabilistic finite-state agents interacting in a shared continuous-state environment. Each agent observes the environment using a neural perception mechanism, which converts inputs such as images into symbolic percepts, and makes decisions symbolically. We focus on the class of NS-CSGs with Borel state spaces and prove the existence and measurability of the value function for zero-sum discounted cumulative rewards under piecewise-constant restrictions on the components of this class of models. To compute values and synthesise strategies, we present, for the first time, practical value iteration (VI) and policy iteration (PI) algorithms to solve this new subclass of continuous-state CSGs. These require a finite decomposition of the environment induced by the neural perception mechanisms of the agents and rely on finite abstract representations of value functions and strategies closed under VI or PI. First, we introduce a Borel measurable piecewise-constant (B-PWC) representation of value functions, extend minimax backups to this representation and propose a value iteration algorithm called B-PWC VI. Second, we introduce two novel representations for the value functions and strategies, constant-piecewise-linear (CON-PWL) and constant-piecewise-constant (CON-PWC) respectively, and propose Minimax-action-free PI by extending a recent PI method based on alternating player choices for finite state spaces to Borel state spaces, which does not require normal-form games to be solved.
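To make the idea of a minimax backup over a piecewise-constant value function concrete, here is a minimal sketch in Python. It is not the authors' B-PWC VI algorithm: it assumes the finite decomposition of the environment is already given and fixed, treats each region as a single abstract state with one constant value, and assumes each agent has exactly two actions so that every stage game can be solved in closed form. The names `solve_2x2`, `minimax_backup`, `value_iteration`, `reward`, `trans` and `beta`, and the two-region example, are all hypothetical.

```python
def solve_2x2(m):
    """Value of a 2x2 zero-sum matrix game for the row (maximising) player."""
    (a, b), (c, d) = m
    maximin = max(min(a, b), min(c, d))  # best guarantee with a pure row
    minimax = min(max(a, c), max(b, d))  # best guarantee with a pure column
    if maximin == minimax:               # saddle point in pure strategies
        return maximin
    # No pure saddle point: closed-form value of the mixed equilibrium.
    return (a * d - b * c) / (a + d - b - c)


def minimax_backup(values, reward, trans, beta):
    """One backup of a value function that is constant on each region.

    values[s]       : current value on region s
    reward[s][i][j] : reward when the agents play actions i and j in region s
    trans[s][i][j]  : dict mapping a successor region to its probability
    beta            : discount factor in (0, 1)
    """
    new_values = []
    for s in range(len(values)):
        # Build the stage game: immediate reward plus discounted expected value.
        q = [[reward[s][i][j]
              + beta * sum(p * values[t] for t, p in trans[s][i][j].items())
              for j in range(2)]
             for i in range(2)]
        new_values.append(solve_2x2(q))
    return new_values


def value_iteration(reward, trans, beta, tol=1e-9, max_iter=10_000):
    """Repeat minimax backups until the largest change drops below tol."""
    values = [0.0] * len(reward)
    for _ in range(max_iter):
        new_values = minimax_backup(values, reward, trans, beta)
        if max(abs(x - y) for x, y in zip(new_values, values)) < tol:
            return new_values
        values = new_values
    return values


# Hypothetical example: region 0 is a matching-pennies game that always moves
# to region 1; region 1 is absorbing and pays reward 1 at every step.
reward = [
    [[1.0, -1.0], [-1.0, 1.0]],
    [[1.0, 1.0], [1.0, 1.0]],
]
to_region_1 = {1: 1.0}
trans = [
    [[to_region_1, to_region_1], [to_region_1, to_region_1]],
    [[to_region_1, to_region_1], [to_region_1, to_region_1]],
]
values = value_iteration(reward, trans, beta=0.5)
print([round(v, 6) for v in values])  # [1.0, 2.0]
```

With discount 0.5, the absorbing region is worth 1 / (1 - 0.5) = 2, and region 0 is worth the matching-pennies value (0) plus 0.5 × 2 = 1, which matches the printed result. Larger action sets would need a general matrix-game solver, typically a linear program, and a continuous-state setting would also require refining the regions during each backup, which this sketch deliberately omits.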
