Learning Unmanned Aerial Vehicle Control for Autonomous Target Following
This work addresses the challenge of data-efficient and safe learning for real-world robotic applications, specifically UAV target following. The contribution appears incremental, since it builds on existing deep RL and control methods.
The paper tackles the problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. It develops a hierarchical approach that combines a model-free policy gradient method with a conventional PID controller, and shows that the learned policy can be trained efficiently in a simulator and transferred successfully to a real-world DJI quadrotor platform.
While deep reinforcement learning (RL) methods have achieved unprecedented successes in a range of challenging problems, their applicability has been mainly limited to simulation or game domains due to the high sample complexity of the trial-and-error learning process. However, real-world robotic applications often need a data-efficient learning process with safety-critical constraints. In this paper, we consider the challenging problem of learning unmanned aerial vehicle (UAV) control for tracking a moving target. To acquire a strategy that combines perception and control, we represent the policy by a convolutional neural network. We develop a hierarchical approach that combines a model-free policy gradient method with a conventional feedback proportional-integral-derivative (PID) controller to enable stable learning without catastrophic failure. The neural network is trained by a combination of supervised learning from raw images and reinforcement learning from games of self-play. We show that the proposed approach can efficiently learn a target-following policy in a simulator and that the learned behavior can be successfully transferred to the DJI quadrotor platform for real-world UAV control.
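To make the hierarchical idea concrete, the following is a minimal sketch of one plausible reading, assuming the learned policy supplies a high-level velocity setpoint and a PID loop tracks it. It is not the paper's implementation: the 1-D point-mass dynamics, the gains, and the `placeholder_policy` function (a hand-written stand-in for the convolutional policy network) are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class PID:
    """Textbook discrete PID controller (gains are illustrative assumptions)."""
    kp: float
    ki: float
    kd: float
    integral: float = 0.0
    prev_error: float = 0.0

    def step(self, error: float, dt: float) -> float:
        # Accumulate the integral term and estimate the derivative by
        # a backward difference of the error signal.
        self.integral += error * dt
        derivative = (error - self.prev_error) / dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative


def placeholder_policy(target_offset: float) -> float:
    """Hypothetical stand-in for the learned policy.

    Maps the observed offset to a velocity setpoint clipped to [-1, 1].
    In the paper this role is played by a CNN operating on raw images.
    """
    return max(-1.0, min(1.0, 0.5 * target_offset))


def simulate(steps: int = 400, dt: float = 0.05) -> float:
    """Run the two-level loop on an assumed 1-D point-mass UAV model."""
    pid = PID(kp=2.0, ki=0.1, kd=0.05)
    uav_pos, uav_vel, target_pos = 0.0, 0.0, 5.0
    for _ in range(steps):
        # High level: the policy proposes a bounded velocity setpoint.
        setpoint = placeholder_policy(target_pos - uav_pos)
        # Low level: the PID loop turns the velocity error into acceleration.
        accel = pid.step(setpoint - uav_vel, dt)
        uav_vel += accel * dt
        uav_pos += uav_vel * dt
    return uav_pos


if __name__ == "__main__":
    print(f"final UAV position: {simulate():.3f} (target at 5.0)")
```

Because the high-level output is clipped and the low-level loop is a conventional feedback controller, a poorly trained policy in this sketch can only request bounded setpoints, which illustrates why such a split may help avoid catastrophic failure during learning.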