GT AI · Dec 5, 2025

On Dynamic Programming Theory for Leader-Follower Stochastic Games

Jilles Steeve Dibangoye, Thibaut Le Marre, Ocan Sankur, François Schwarzentruber

arXiv:2512.05667v11.2

Originality: Incremental advance

AI Analysis

This work addresses sequential decision-making under asymmetric commitment, with applications such as security games and resource allocation. It is rated incremental because it builds on existing game-theoretic concepts and contributes algorithmic improvements.

The paper tackles the problem of computing strong Stackelberg equilibria in leader-follower stochastic games. It introduces a dynamic programming framework over credible sets, proves a reduction to Markov decision processes, and develops ε-optimal algorithms, reporting empirical gains in leader value and runtime scalability over state-of-the-art methods.

Leader-follower general-sum stochastic games (LF-GSSGs) model sequential decision-making under asymmetric commitment, where a leader commits to a policy and a follower best responds, yielding a strong Stackelberg equilibrium (SSE) with leader-favourable tie-breaking. This paper introduces a dynamic programming (DP) framework that applies Bellman recursion over credible sets (state abstractions formally representing all rational follower best responses under partial leader commitments) to compute SSEs. We first prove that any LF-GSSG admits a lossless reduction to a Markov decision process (MDP) over credible sets. We further establish that synthesising an optimal memoryless deterministic leader policy is NP-hard, motivating the development of ε-optimal DP algorithms with provable guarantees on leader exploitability. Experiments on standard mixed-motive benchmarks, including security games, resource allocation, and adversarial planning, demonstrate empirical gains in leader value and runtime scalability over state-of-the-art methods.
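
To make the equilibrium notion in the abstract concrete, here is a minimal sketch of a strong Stackelberg equilibrium with leader-favourable tie-breaking. It is not the paper's credible-set DP algorithm: it assumes a single-state (one-shot) game, restricts the leader to deterministic commitments, and uses hypothetical names (`pure_sse`, `leader_payoff`, `follower_payoff`) and made-up payoff values chosen only for illustration.

```python
from typing import List, Optional, Tuple


def pure_sse(
    leader_payoff: List[List[float]],
    follower_payoff: List[List[float]],
) -> Optional[Tuple[int, int, float]]:
    """Illustrative only: SSE of a one-shot game over deterministic leader commitments.

    Rows index leader actions, columns index follower actions.
    Returns (leader_action, follower_action, leader_value).
    """
    best: Optional[Tuple[int, int, float]] = None
    for a, (l_row, f_row) in enumerate(zip(leader_payoff, follower_payoff)):
        # The follower best responds to the committed leader action a.
        top = max(f_row)
        best_responses = [b for b, v in enumerate(f_row) if v == top]
        # Strong Stackelberg: ties among follower best responses are
        # broken in the leader's favour.
        b = max(best_responses, key=lambda j: l_row[j])
        candidate = (a, b, l_row[b])
        if best is None or candidate[2] > best[2]:
            best = candidate
    return best


# Hypothetical payoffs: under leader action 0 the follower is indifferent,
# so the tie is resolved towards the column the leader prefers.
leader_payoff = [[2, 4], [1, 3]]
follower_payoff = [[1, 1], [0, 2]]
print(pure_sse(leader_payoff, follower_payoff))
```

In the sequential setting the abstract describes, the follower's set of best responses depends on the leader's commitments across states, which is the role the abstract attributes to credible sets; this one-shot sketch sidesteps that dependency entirely.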
