LG, AI | Nov 10, 2025

Superhuman AI for Stratego Using Self-Play Reinforcement Learning and Test-Time Search

Samuel Sokota, Eugene Vinitsky, Hengyuan Hu, J. Zico Kolter, Gabriele Farina

arXiv:2511.07312v1 | 4 citations | h-index: 71

Originality: Highly original

AI Analysis

This work solves a long-standing AI benchmark in game playing, a significant advance over prior efforts that were costly and unsuccessful.

The authors address superhuman play in Stratego, a board game that demands strategic decision-making under hidden information. Their approach does not merely match top human players; it reaches a vastly superhuman level at a cost of only a few thousand dollars.

Few classical games have been regarded as such significant benchmarks of artificial intelligence as to have justified training costs in the millions of dollars. Among these, Stratego -- a board wargame exemplifying the challenge of strategic decision making under massive amounts of hidden information -- stands apart as a case where such efforts failed to produce performance at the level of top humans. This work establishes a step change in both performance and cost for Stratego, showing that it is now possible not only to reach the level of top humans, but to achieve vastly superhuman level -- and that doing so requires not an industrial budget, but merely a few thousand dollars. We achieved this result by developing general approaches for self-play reinforcement learning and test-time search under imperfect information.
