MA AI RO SYJan 22, 2025

A Hierarchical Reinforcement Learning Framework for Multi-UAV Combat Using Leader-Follower Strategy

Jinhui Pang, Jinglin He, Noureldin Mohamed Abdelaal Ahmed Mohamed, Changqing Lin, Zhihui Zhang, Xiaoshuai Hao

arXiv:2501.13132v14.317 citationsh-index: 3Knowledge-Based Systems

Originality Incremental advance

AI Analysis

This addresses the challenge of high-dimensional action spaces and cooperative dynamics in multi-UAV combat for aerospace and AI applications, representing an incremental improvement over existing methods.

The paper tackles the problem of multi-UAV air combat by proposing a hierarchical reinforcement learning framework with a leader-follower strategy to enhance adversarial performance and cooperation, validated through simulation experiments.

Multi-UAV air combat is a complex task involving multiple autonomous UAVs, an evolving field in both aerospace and artificial intelligence. This paper aims to enhance adversarial performance through collaborative strategies. Previous approaches predominantly discretize the action space into predefined actions, limiting UAV maneuverability and complex strategy implementation. Others simplify the problem to 1v1 combat, neglecting the cooperative dynamics among multiple UAVs. To address the high-dimensional challenges inherent in six-degree-of-freedom space and improve cooperation, we propose a hierarchical framework utilizing the Leader-Follower Multi-Agent Proximal Policy Optimization (LFMAPPO) strategy. Specifically, the framework is structured into three levels. The top level conducts a macro-level assessment of the environment and guides execution policy. The middle level determines the angle of the desired action. The bottom level generates precise action commands for the high-dimensional action space. Moreover, we optimize the state-value functions by assigning distinct roles with the leader-follower strategy to train the top-level policy, followers estimate the leader's utility, promoting effective cooperation among agents. Additionally, the incorporation of a target selector, aligned with the UAVs' posture, assesses the threat level of targets. Finally, simulation experiments validate the effectiveness of our proposed method.

View on arXiv PDF

Similar