AIOCOct 18, 2020

Visibility Optimization for Surveillance-Evasion Games

arXiv:2010.09001v21 citations
Originality Incremental advance
AI Analysis

This work addresses the computational challenge of real-time strategy optimization in surveillance-evasion games, which is incremental as it builds on existing game theory and reinforcement learning methods.

The paper tackles the problem of computing optimal strategies in surveillance-evasion differential games, where a pursuer must maintain visibility of an evader, by proposing locally optimal strategies and training deep neural networks with Monte Carlo tree search and self-play reinforcement learning, achieving scalable policies for higher resolutions with sufficient computational resources.

We consider surveillance-evasion differential games, where a pursuer must try to constantly maintain visibility of a moving evader. The pursuer loses as soon as the evader becomes occluded. Optimal controls for game can be formulated as a Hamilton-Jacobi-Isaac equation. We use an upwind scheme to compute the feedback value function, corresponding to the end-game time of the differential game. Although the value function enables optimal controls, it is prohibitively expensive to compute, even for a single pursuer and single evader on a small grid. We consider a discrete variant of the surveillance-game. We propose two locally optimal strategies based on the static value function for the surveillance-evasion game with multiple pursuers and evaders. We show that Monte Carlo tree search and self-play reinforcement learning can train a deep neural network to generate reasonable strategies for on-line game play. Given enough computational resources and offline training time, the proposed model can continue to improve its policies and efficiently scale to higher resolutions.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes