RO CV LG | Mar 18, 2024

Bootstrapping Reinforcement Learning with Imitation for Vision-Based Agile Flight

Jiaxu Xing, Angel Romero, Leonard Bauersfeld, Davide Scaramuzza

arXiv:2403.12203v321.348 citations | h-index: 15 | CoRL

Originality: Incremental advance

AI Analysis

This work addresses inefficient policy exploration in vision-based autonomous drone racing. It is an incremental improvement that integrates two existing approaches, reinforcement learning and imitation learning, rather than introducing a new one.

The paper tackles the challenge of learning visuomotor policies for agile quadrotor flight by combining reinforcement learning (RL) and imitation learning (IL) to improve sample efficiency and performance, achieving successful navigation in simulated and real-world drone racing scenarios using only visual inputs.

Learning visuomotor policies for agile quadrotor flight presents significant difficulties, primarily from inefficient policy exploration caused by high-dimensional visual inputs and the need for precise and low-latency control. To address these challenges, we propose a novel approach that combines the performance of Reinforcement Learning (RL) and the sample efficiency of Imitation Learning (IL) in the task of vision-based autonomous drone racing. While RL provides a framework for learning high-performance controllers through trial and error, it faces challenges with sample efficiency and computational demands due to the high dimensionality of visual inputs. Conversely, IL efficiently learns from visual expert demonstrations, but it remains limited by the expert's performance and state distribution. To overcome these limitations, our policy learning framework integrates the strengths of both approaches. Our framework contains three phases: training a teacher policy using RL with privileged state information, distilling it into a student policy via IL, and adaptive fine-tuning via RL. Testing in both simulated and real-world scenarios shows our approach can not only learn in scenarios where RL from scratch fails but also outperforms existing IL methods in both robustness and performance, successfully navigating a quadrotor through a race course using only visual information. Videos of the experiments are available at https://rpg.ifi.uzh.ch/bootstrap-rl-with-il/index.html.
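The three phases named in the abstract (privileged-state teacher via RL, distillation into a student via IL, RL fine-tuning of the student) can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the task, the reward, the linear policies, the `observe` sensor model, and every name below are hypothetical. A simple hill-climbing random search stands in for whatever RL algorithm the paper uses, and least-squares behavior cloning stands in for its IL step.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy task: privileged state s in R^2, optimal action a* = s @ W_TRUE.
W_TRUE = np.array([1.5, -0.5])
PROJ = rng.normal(size=(2, 8))  # stand-in "camera": state -> 8-D observation


def observe(states):
    """Toy sensor model: nonlinear, higher-dimensional view of the state."""
    return np.tanh(states @ PROJ)


def avg_reward(weights, inputs, states):
    """Mean reward of the linear policy a = inputs @ weights.

    Reward is the negative squared error to the optimal action.
    """
    actions = inputs @ weights
    return -np.mean((actions - states @ W_TRUE) ** 2)


def random_search_rl(weights, inputs, states, iters, sigma):
    """Hill-climbing policy search (a stand-in for RL).

    A random perturbation is kept only if it improves the reward,
    so the reward never decreases.
    """
    best = avg_reward(weights, inputs, states)
    for _ in range(iters):
        candidate = weights + sigma * rng.normal(size=weights.shape)
        reward = avg_reward(candidate, inputs, states)
        if reward > best:
            weights, best = candidate, reward
    return weights


states = rng.uniform(-1.0, 1.0, size=(256, 2))
obs = observe(states)

# Phase 1: teacher policy trained with "RL" on privileged state information.
teacher_init = np.zeros(2)
teacher = random_search_rl(teacher_init, states, states, iters=300, sigma=0.1)

# Phase 2: distill the teacher into an observation-based student (behavior cloning).
teacher_actions = states @ teacher
student, *_ = np.linalg.lstsq(obs, teacher_actions, rcond=None)

# Phase 3: fine-tune the student with "RL" directly on the task reward.
finetuned = random_search_rl(student, obs, states, iters=300, sigma=0.02)

r_init = avg_reward(teacher_init, states, states)
r_teacher = avg_reward(teacher, states, states)
r_distilled = avg_reward(student, obs, states)
r_finetuned = avg_reward(finetuned, obs, states)
print(f"teacher: {r_init:.4f} -> {r_teacher:.4f}")
print(f"student: distilled {r_distilled:.4f}, fine-tuned {r_finetuned:.4f}")
```

In this sketch the distilled student can at best reproduce the teacher's actions, which mirrors the abstract's remark that IL is limited by the expert's performance. The fine-tuning phase optimizes the task reward directly, so the student is no longer tied to the teacher's behavior.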
