GTApr 13

Incremental Data-Driven Policy Synthesis via Game Abstractions

Irmak Sağlam, Mahdi Nazeri, Alessandro Abate, Sadegh Soudjani, Anne-Kathrin Schmuck

arXiv:2511.1154532.31 citationsh-index: 31

AI Analysis

For control engineers working with unknown stochastic systems, this work offers an incremental approach that reduces computational cost, though it is an incremental improvement over existing abstraction-based methods.

This paper presents a data-driven framework for synthesizing control policies for unknown stochastic systems to satisfy temporal logic objectives, using incremental game abstractions. The method achieves significant computational savings compared to re-solving the entire game from scratch when new data arrives.

We address the synthesis of control policies for unknown discrete-time stochastic dynamical systems to satisfy temporal logic objectives. We present a data-driven, abstraction-based control framework that integrates online learning with novel incremental game-solving. Under appropriate continuity assumptions, our method abstracts the system dynamics into a finite stochastic (2.5-player) game graph derived from data. Given a requirement over time on this graph, we compute the winning region -- i.e., the set of initial states from which the objective is satisfiable -- in the resulting game, together with a corresponding control policy. Our main contribution is the construction of abstractions, winning regions and control policies \emph{incrementally}, as data about the system dynamics accumulates. Concretely, our algorithm refines under- and over-approximations of reachable sets for each state-action pair as new data samples arrive. These refinements induce structural modifications in the game graph abstraction -- such as the addition or removal of nodes and edges -- which in turn modify the winning region. Crucially, we show that these updates are inherently monotonic: under-approximations only grow, over-approximations only shrink, and the winning region only expands. We exploit this monotonicity by defining an objective-induced ranking function on the nodes of the abstract game that increases monotonically as new data samples are incorporated. These ranks underpin our novel incremental game-solving algorithm, which employs customized gadgets (DAG-like subgames) within a rank-lifting algorithm to efficiently update the winning region. Numerical case studies demonstrate significant computational savings compared to the baseline approach, which re-solves the entire game from scratch whenever new data samples arrive.

View on arXiv PDF

Similar