Deep Reinforcement Learning for Weapons to Targets Assignment in a Hypersonic Strike
This work addresses real-time autonomous decision-making in military missions, though it appears incremental: it applies existing RL methods to a specific domain.
The paper tackles the problem of optimizing weapons-to-target assignment for hypersonic strikes using deep reinforcement learning, achieving near-optimal performance with a 1000x speedup in computation time compared to a non-linear integer programming benchmark.
We use deep reinforcement learning (RL) to optimize a weapons-to-target assignment (WTA) policy for a multi-vehicle hypersonic strike against multiple targets. The objective is to maximize the total value of destroyed targets in each episode. Each randomly generated episode varies the number and initial conditions of the hypersonic strike weapons (HSW) and targets, the value distribution of the targets, and the probability of an HSW being intercepted. We compare the performance of this WTA policy to that of a benchmark WTA policy derived using non-linear integer programming (NLIP), and find that the RL WTA policy gives near-optimal performance with a 1000x speedup in computation time, allowing real-time operation that facilitates autonomous decision-making in the mission end game.
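To make the objective concrete, the sketch below scores an assignment by the expected total value of destroyed targets and finds the best assignment by exhaustive search on a tiny instance. It is an illustration only, not the paper's formulation: it assumes the classic static WTA objective, in which weapons act independently and each entry of `p_kill` already folds in the chance that the weapon survives interception. The names `expected_destroyed_value`, `brute_force_wta`, `values`, and `p_kill` are hypothetical. Exhaustive search stands in for the NLIP benchmark only in spirit; its cost grows as targets^weapons, which is the scaling problem that motivates a fast learned policy.

```python
from itertools import product
from math import prod


def expected_destroyed_value(assignment, values, p_kill):
    """Expected total value of destroyed targets for one assignment.

    assignment[i] is the index of the target attacked by weapon i.
    values[j] is the value of target j.
    p_kill[i][j] is the (assumed) probability that weapon i destroys
    target j, including survival against interception.
    """
    total = 0.0
    for j, value in enumerate(values):
        # Probability that target j survives every weapon assigned to it.
        # With no weapons assigned, the empty product is 1 (target survives).
        survive = prod(
            1.0 - p_kill[i][j]
            for i, target in enumerate(assignment)
            if target == j
        )
        total += value * (1.0 - survive)
    return total


def brute_force_wta(values, p_kill):
    """Exhaustively search all assignments (feasible only for tiny cases)."""
    n_weapons = len(p_kill)
    n_targets = len(values)
    best = max(
        product(range(n_targets), repeat=n_weapons),
        key=lambda a: expected_destroyed_value(a, values, p_kill),
    )
    return best, expected_destroyed_value(best, values, p_kill)


if __name__ == "__main__":
    # Hypothetical instance: three identical weapons, two targets.
    values = [10.0, 4.0]
    p_kill = [[0.8, 0.8] for _ in range(3)]
    assignment, score = brute_force_wta(values, p_kill)
    print(assignment, round(score, 3))
```

In this toy instance, sending every weapon at the high-value target is not optimal, because each additional weapon on the same target adds less expected value than the one before. A WTA policy, whether learned or solved by integer programming, has to weigh that trade-off.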