LG AI · Feb 8, 2024

Scaling Intelligent Agents in Combat Simulations for Wargaming

arXiv:2402.06694v1 · 3 citations · h-index: 2

Originality: Synthesis-oriented

AI Analysis

This research addresses the problem of scaling AI agents in military wargaming for defense planners. The contribution is incremental, since it builds on existing hierarchical reinforcement learning (HRL) methods.

The paper tackles the challenge of achieving superhuman performance in combat simulations for wargaming by extending hierarchical reinforcement learning (HRL) to manage computational complexity, with initial results focusing on framework development and state space abstractions.

Remaining competitive in future conflicts with technologically advanced competitors requires us to accelerate our research and development in artificial intelligence (AI) for wargaming. More importantly, leveraging machine learning for intelligent combat behavior development will be key to one day achieving superhuman performance in this domain, elevating the quality and accelerating the speed of our decisions in future wars. Although deep reinforcement learning (RL) continues to show promising results in intelligent agent behavior development in games, it has yet to perform at or above the human level in the long-horizon, complex tasks typically found in combat modeling and simulation. Capitalizing on the proven potential of RL and recent successes of hierarchical reinforcement learning (HRL), our research is investigating and extending the use of HRL to create intelligent agents capable of performing effectively in these large and complex simulation environments. Our ultimate goal is to develop an agent capable of superhuman performance that could then serve as an AI advisor to military planners and decision-makers. This paper covers our ongoing approach and the first three of our five research areas aimed at managing the exponential growth of computations that has thus far limited the use of AI in combat simulations: (1) developing an HRL training framework and agent architecture for combat units; (2) developing a multi-model framework for agent decision-making; (3) developing dimension-invariant observation abstractions of the state space to manage the exponential growth of computations; (4) developing an intrinsic rewards engine to enable long-term planning; and (5) implementing this framework into a higher-fidelity combat simulation.
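
To give a rough sense of what a dimension-invariant observation abstraction (research area 3) could look like, the sketch below pools unit strengths from a map of any size into a fixed-length vector. This is only an illustration under stated assumptions, not the authors' method: the unit record layout, the function name abstract_observation, the 4 x 4 grid, and the two-channel (blue/red) encoding are all hypothetical.

```python
from typing import List, Tuple

# Hypothetical unit record (not from the paper): (x, y, side, strength),
# where side is "blue" or "red" and coordinates lie within the map bounds.
Unit = Tuple[float, float, str, float]


def abstract_observation(units: List[Unit], map_w: float, map_h: float,
                         grid: int = 4) -> List[float]:
    """Pool units on a map of any size into a fixed grid * grid * 2 vector."""
    obs = [0.0] * (grid * grid * 2)
    for x, y, side, strength in units:
        # Scale map coordinates to grid cells; clamp so x == map_w stays in range.
        col = min(int(x / map_w * grid), grid - 1)
        row = min(int(y / map_h * grid), grid - 1)
        channel = 0 if side == "blue" else 1
        obs[(row * grid + col) * 2 + channel] += strength
    return obs


# The same relative layout on a 10 x 10 map and on a 1000 x 1000 map.
small = abstract_observation([(1, 1, "blue", 10.0), (9, 9, "red", 5.0)], 10, 10)
large = abstract_observation([(100, 100, "blue", 10.0), (900, 900, "red", 5.0)], 1000, 1000)
```

In this sketch the observation length depends only on the grid resolution, so a policy network's input size would stay constant as the map or the number of units grows. The trade-off is that positions within a cell are no longer distinguishable.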
