RO CV LG Dec 9, 2024

Vision-Based Deep Reinforcement Learning of UAV Autonomous Navigation Using Privileged Information

Junqiao Wang, Zhongliang Yu, Dong Zhou, Jiaqi Shi, Runran Deng

arXiv:2412.06313v17.17 citations h-index: 5

Originality: Incremental advance

AI Analysis

This paper addresses efficient UAV navigation for applications such as agriculture and disaster relief, but it appears incremental because it builds on existing deep reinforcement learning and privileged learning techniques.

The paper tackles high-speed autonomous UAV navigation in partially observable environments by proposing the DPRL algorithm, which combines deep reinforcement learning with privileged learning. The authors report better flight efficiency, robustness, and success rate than state-of-the-art methods in simulation.

The capability of UAVs for efficient autonomous navigation and obstacle avoidance in complex and unknown environments is critical for applications in agricultural irrigation, disaster relief and logistics. In this paper, we propose the DPRL (Distributed Privileged Reinforcement Learning) navigation algorithm, an end-to-end policy designed to address the challenge of high-speed autonomous UAV navigation under partially observable environmental conditions. Our approach combines deep reinforcement learning with privileged learning to overcome the impact of observation data corruption caused by partial observability. We leverage an asymmetric Actor-Critic architecture to provide the agent with privileged information during training, which enhances the model's perceptual capabilities. Additionally, we present a multi-agent exploration strategy across diverse environments to accelerate experience collection, which in turn expedites model convergence. We conducted extensive simulations across various scenarios, benchmarking our DPRL algorithm against the state-of-the-art navigation algorithms. The results consistently demonstrate the superior performance of our algorithm in terms of flight efficiency, robustness and overall success rate.
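
The asymmetric Actor-Critic idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' DPRL implementation: the linear function approximators, the dimensions, the fixed-variance Gaussian policy, the one-step TD update, and every name below are assumptions chosen to keep the example short. The point it tries to show is only the asymmetry: the actor reads the partial observation, while the critic also reads privileged state that is available during training.

```python
import random

# Hypothetical sizes: partial observation, privileged state, action.
OBS_DIM, PRIV_DIM, ACT_DIM = 4, 3, 2
GAMMA, SIGMA = 0.99, 0.5  # assumed discount factor and fixed policy std


def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))


class AsymmetricActorCritic:
    """Linear actor over observations; linear critic over observations + privileged state."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        # Actor weights: one row per action dimension, over the observation only.
        self.actor_w = [[self.rng.uniform(-0.1, 0.1) for _ in range(OBS_DIM)]
                        for _ in range(ACT_DIM)]
        # Critic weights: over the concatenated observation and privileged state.
        self.critic_w = [0.0] * (OBS_DIM + PRIV_DIM)

    def act_mean(self, obs):
        # The actor never sees privileged information, so it can be deployed without it.
        return [dot(row, obs) for row in self.actor_w]

    def sample_action(self, obs):
        return [m + self.rng.gauss(0.0, SIGMA) for m in self.act_mean(obs)]

    def value(self, obs, priv):
        # The critic is only needed during training, so it may use privileged state.
        return dot(self.critic_w, obs + priv)

    def update(self, obs, priv, action, reward, next_obs, next_priv, done,
               lr_actor=0.01, lr_critic=0.05):
        # One-step TD target and error computed by the privileged critic.
        bootstrap = 0.0 if done else GAMMA * self.value(next_obs, next_priv)
        td_error = reward + bootstrap - self.value(obs, priv)

        # Critic: semi-gradient TD(0) step on its linear features.
        feats = obs + priv
        self.critic_w = [w + lr_critic * td_error * f
                         for w, f in zip(self.critic_w, feats)]

        # Actor: policy-gradient step, using the TD error as the advantage estimate.
        # For a Gaussian with fixed std, d log pi / d row_i = (a_i - mu_i) / sigma^2 * obs.
        mean = self.act_mean(obs)
        for i, row in enumerate(self.actor_w):
            scale = lr_actor * td_error * (action[i] - mean[i]) / SIGMA ** 2
            self.actor_w[i] = [w + scale * o for w, o in zip(row, obs)]
        return td_error
```

In this sketch, only `act_mean` would be needed at deployment time; `value` and the privileged input exist purely to shape the training signal. A vision-based system like the one described would presumably replace the linear maps with neural networks over image features, which is outside the scope of this example.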
