Learning to Recharge: UAV Coverage Path Planning through Deep Reinforcement Learning
This paper addresses the problem of efficient area coverage for UAVs with limited battery life, offering an incremental improvement in path planning through deep reinforcement learning.
The paper tackles the power-constrained coverage path planning problem for battery-limited UAVs by integrating recharge journeys into the strategy, proposing a novel PPO-based deep reinforcement learning approach that outperforms a baseline heuristic and generalizes to different target zones and maps.
Coverage path planning (CPP) is a critical problem in robotics, where the goal is to find an efficient path that covers every point in an area of interest. This work addresses the power-constrained CPP problem with recharge for battery-limited unmanned aerial vehicles (UAVs). A notable challenge in this problem is integrating recharge journeys into the overall coverage strategy, which requires strategic, long-term decisions. We propose a novel proximal policy optimization (PPO)-based deep reinforcement learning (DRL) approach with map-based observations, utilizing action masking and discount factor scheduling to optimize coverage trajectories over the entire mission horizon. We further provide the agent with a position history to handle emergent state loops caused by the recharge capability. Our approach outperforms a baseline heuristic and generalizes to different target zones and maps, although generalization to unseen maps is limited. We offer valuable insights into DRL algorithm design for long-horizon problems and provide a publicly available software framework for the CPP problem.
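To make the two training techniques named in the abstract more concrete, the following is a minimal sketch of action masking and discount factor scheduling in general form. It is not the paper's implementation: the function names `masked_softmax` and `scheduled_gamma`, the linear schedule, its increasing direction, and the values 0.95 and 0.999 are all assumptions chosen for illustration. How the validity mask is derived from the map and battery state is not described in this summary, so the mask is taken as given.

```python
import math


def masked_softmax(logits, mask):
    """Turn policy logits into probabilities, giving invalid actions zero mass.

    `mask[i]` is True when action i is allowed. How validity is decided
    (obstacles, remaining battery, etc.) is outside this sketch.
    """
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Invalid actions get a logit of -inf, so exp() maps them to exactly 0.
    masked = [l if m else -math.inf for l, m in zip(logits, mask)]
    top = max(masked)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


def scheduled_gamma(step, total_steps, gamma_start=0.95, gamma_end=0.999):
    """Linearly move the discount factor from gamma_start to gamma_end.

    The linear shape and the default endpoints are hypothetical values,
    not taken from the paper.
    """
    frac = min(max(step / total_steps, 0.0), 1.0)  # clamp progress to [0, 1]
    return gamma_start + frac * (gamma_end - gamma_start)


if __name__ == "__main__":
    # Action 1 is masked out, so it can never be sampled.
    print(masked_softmax([1.0, 2.0, 3.0], [True, False, True]))
    # The discount factor halfway through training under the assumed schedule.
    print(scheduled_gamma(50, 100))
```

Masking at the logit level keeps the remaining probabilities normalized without any extra penalty term in the reward. A discount factor that grows during training is one common way to let an agent first learn short-horizon behavior before long-horizon credit assignment; whether the paper follows this exact pattern should be checked against the full text.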