LGMay 6, 2025

Joint Resource Management for Energy-efficient UAV-assisted SWIPT-MEC: A Deep Reinforcement Learning Approach

Yue Chen, Hui Kang, Jiahui Li, Geng Sun, Boxiong Wang, Jiacheng Wang, Cong Liang, Shuang Liang, Dusit Niyato

arXiv:2505.03230v29.412 citationsh-index: 19IEEE Internet of Things Journal

Originality Highly original

AI Analysis

It addresses energy efficiency and sustainability for IoT terminals in remote or disaster scenarios, representing an incremental improvement with a novel method for a known bottleneck.

This paper tackles the challenge of managing energy and computational resources in UAV-assisted SWIPT-MEC systems for 6G IoT networks, proposing a deep reinforcement learning approach that achieves efficient energy management and high computational performance in simulations.

The integration of simultaneous wireless information and power transfer (SWIPT) technology in 6G Internet of Things (IoT) networks faces significant challenges in remote areas and disaster scenarios where ground infrastructure is unavailable. This paper proposes a novel unmanned aerial vehicle (UAV)-assisted mobile edge computing (MEC) system enhanced by directional antennas to provide both computational resources and energy support for ground IoT terminals. However, such systems require multiple trade-off policies to balance UAV energy consumption, terminal battery levels, and computational resource allocation under various constraints, including limited UAV battery capacity, non-linear energy harvesting characteristics, and dynamic task arrivals. To address these challenges comprehensively, we formulate a bi-objective optimization problem that simultaneously considers system energy efficiency and terminal battery sustainability. We then reformulate this non-convex problem with a hybrid solution space as a Markov decision process (MDP) and propose an improved soft actor-critic (SAC) algorithm with an action simplification mechanism to enhance its convergence and generalization capabilities. Simulation results have demonstrated that our proposed approach outperforms various baselines in different scenarios, achieving efficient energy management while maintaining high computational performance. Furthermore, our method shows strong generalization ability across different scenarios, particularly in complex environments, validating the effectiveness of our designed boundary penalty and charging reward mechanisms.

View on arXiv PDF

Similar