AI | RO | Sep 24, 2017

Learning Unmanned Aerial Vehicle Control for Autonomous Target Following

arXiv:1709.08233v1 | 41 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of data-efficient, safe learning for real-world robotic applications, specifically UAV target following. The contribution appears incremental, since it builds on existing deep RL and control methods.

The paper tackles the problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. It develops a hierarchical approach that combines a model-free policy gradient method with a conventional PID controller, and shows that the learned policy can be trained efficiently in a simulator and transferred successfully to a real-world DJI quadrotor platform.

While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process. However, real-world robotic applications often need a data-efficient learning process with safety-critical constraints. In this paper, we consider the challenging problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire a strategy that combines perception and control, we represent the policy by a convolutional neural network. We develop a hierarchical approach that combines a model-free policy gradient method with a conventional feedback proportional-integral-derivative (PID) controller to enable stable learning without catastrophic failure. The neural network is trained by a combination of supervised learning from raw images and reinforcement learning from games of self-play. We show that the proposed approach can learn a target following policy in a simulator efficiently and the learned behavior can be successfully transferred to the DJI quadrotor platform for real-world UAV control.
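The hierarchical split described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the 1-D point-mass dynamics, the gains, and the `high_level_policy` function (a hand-written stand-in for the learned convolutional policy) are all assumptions made for illustration. The high level picks a bounded velocity setpoint from the observed target offset, and a textbook PID loop tracks that setpoint, so the low-level controller never receives an unbounded command.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative, not from the paper)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative by finite difference.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def high_level_policy(offset: float, max_speed: float = 2.0) -> float:
    """Hypothetical stand-in for the learned policy: maps the observed
    target offset to a velocity setpoint clipped to +/- max_speed."""
    return max(-max_speed, min(max_speed, 1.5 * offset))


def simulate(steps: int = 400, dt: float = 0.05) -> float:
    """1-D toy: a unit-mass follower tracks a target moving at constant speed.
    Returns the final absolute distance to the target."""
    pid = PID(kp=4.0, ki=0.5, kd=0.0)
    pos, vel = 0.0, 0.0
    target = 5.0
    for _ in range(steps):
        target += 0.5 * dt                        # target drifts at 0.5 units/s
        v_ref = high_level_policy(target - pos)   # high level: choose a setpoint
        accel = pid.step(v_ref - vel, dt)         # low level: PID tracks the setpoint
        vel += accel * dt                         # explicit Euler integration
        pos += vel * dt
    return abs(target - pos)


if __name__ == "__main__":
    print(f"final distance to target: {simulate():.3f}")
```

In a learned version, `high_level_policy` would be replaced by a network that takes images as input, while the PID loop stays fixed. Clipping the setpoint is one simple way to keep exploration from pushing the vehicle into unstable regimes.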

