LGSPSYSep 28, 2025

Integrated Communication and Control for Energy-Efficient UAV Swarms: A Multi-Agent Reinforcement Learning Approach

arXiv:2509.23905v11 citationsh-index: 3
Originality Incremental advance
AI Analysis

This work addresses energy efficiency and reliability issues for UAV swarm networks in emergency and remote scenarios, representing an incremental improvement with a novel algorithm.

The paper tackles the problem of unreliable air-to-ground links in UAV swarm-assisted communications in complex environments by proposing an integrated communication and control co-design mechanism, which achieves a fairness index of 0.99 and reduces energy consumption by up to 25% compared to baselines.

The deployment of unmanned aerial vehicle (UAV) swarm-assisted communication networks has become an increasingly vital approach for remediating coverage limitations in infrastructure-deficient environments, with especially pressing applications in temporary scenarios, such as emergency rescue, military and security operations, and remote area coverage. However, complex geographic environments lead to unpredictable and highly dynamic wireless channel conditions, resulting in frequent interruptions of air-to-ground (A2G) links that severely constrain the reliability and quality of service in UAV swarm-assisted mobile communications. To improve the quality of UAV swarm-assisted communications in complex geographic environments, we propose an integrated communication and control co-design mechanism. Given the stringent energy constraints inherent in UAV swarms, our proposed mechanism is designed to optimize energy efficiency while maintaining an equilibrium between equitable communication rates for mobile ground users (GUs) and UAV energy expenditure. We formulate the joint resource allocation and 3D trajectory control problem as a Markov decision process (MDP), and develop a multi-agent reinforcement learning (MARL) framework to enable real-time coordinated actions across the UAV swarm. To optimize the action policy of UAV swarms, we propose a novel multi-agent hybrid proximal policy optimization with action masking (MAHPPO-AM) algorithm, specifically designed to handle complex hybrid action spaces. The algorithm incorporates action masking to enforce hard constraints in high-dimensional action spaces. Experimental results demonstrate that our approach achieves a fairness index of 0.99 while reducing energy consumption by up to 25% compared to baseline methods.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes