A Benchmark Comparison of Learned Control Policies for Agile Quadrotor Flight
This work addresses sim-to-real transfer for learned quadrotor control, enabling faster and more robust flight. The contribution is incremental, since it builds on existing learning-based methods.
The authors benchmark existing learned control policies for agile quadrotor flight, find that a policy commanding body-rates and thrust transfers more robustly from simulation to the real world than one commanding individual rotor thrusts, and demonstrate real-world control at speeds over 45 km/h.
Quadrotors are highly nonlinear dynamical systems that require carefully tuned controllers to be pushed to their physical limits. Recently, learning-based control policies have been proposed for quadrotors, as they could allow learning direct mappings from high-dimensional raw sensory observations to actions. Due to sample inefficiency, training such learned controllers on the real platform is impractical or even impossible. Training in simulation is attractive but requires transferring policies between domains, which demands that trained policies be robust to the resulting domain gap. In this work, we make two contributions: (i) we perform the first benchmark comparison of existing learned control policies for agile quadrotor flight and show that training a control policy that commands body-rates and thrust results in more robust sim-to-real transfer compared to a policy that directly specifies individual rotor thrusts, (ii) we demonstrate for the first time that such a control policy trained via deep reinforcement learning can control a quadrotor in real-world experiments at speeds over 45 km/h.
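
To make the two action parameterizations compared above concrete, the following is a minimal sketch, not the authors' implementation. It assumes that "thrust" means collective thrust, that a simple proportional body-rate controller sits below the policy, and that the vehicle has an X-shaped rotor layout. All names and numerical values (`ARM`, `KAPPA`, `INERTIA`, `RATE_GAIN`, `F_MAX`, and both function names) are hypothetical and chosen only for illustration.

```python
import numpy as np

# Hypothetical vehicle parameters, for illustration only.
ARM = 0.17                                   # arm length [m]
KAPPA = 0.016                                # rotor drag-torque to thrust ratio [m]
INERTIA = np.diag([0.0025, 0.0021, 0.0043])  # body inertia [kg m^2]
RATE_GAIN = np.diag([20.0, 20.0, 10.0])      # proportional body-rate gains [1/s]
F_MAX = 8.5                                  # maximum thrust per rotor [N]


def allocation_matrix(arm=ARM, kappa=KAPPA):
    """Map four rotor thrusts to [collective thrust, roll, pitch, yaw torque].

    The sign pattern assumes one particular X-configuration rotor ordering.
    """
    d = arm / np.sqrt(2.0)
    return np.array([
        [1.0, 1.0, 1.0, 1.0],
        [d, -d, -d, d],
        [-d, -d, d, d],
        [kappa, -kappa, kappa, -kappa],
    ])


def rates_thrust_to_rotor_thrusts(collective_thrust, omega_des, omega):
    """Body-rates-and-thrust action: a low-level loop turns it into rotor thrusts.

    The policy output (collective thrust, desired body rates) is tracked by a
    proportional rate controller using the measured body rates `omega`.
    """
    rate_error = omega_des - omega
    # Desired body torque: proportional term plus gyroscopic compensation.
    torque = INERTIA @ (RATE_GAIN @ rate_error) + np.cross(omega, INERTIA @ omega)
    wrench = np.concatenate(([collective_thrust], torque))
    # Invert the allocation to obtain the individual rotor thrusts.
    rotor_thrusts = np.linalg.solve(allocation_matrix(), wrench)
    return np.clip(rotor_thrusts, 0.0, F_MAX)


def single_rotor_action(action):
    """Individual-rotor-thrust action: the policy output is applied directly.

    `action` is assumed to lie in [-1, 1] per rotor and is rescaled to [0, F_MAX].
    """
    return 0.5 * (np.clip(action, -1.0, 1.0) + 1.0) * F_MAX


if __name__ == "__main__":
    hover_thrust = 0.75 * 9.81  # hypothetical 0.75 kg vehicle at hover [N]
    print(rates_thrust_to_rotor_thrusts(hover_thrust, np.zeros(3), np.zeros(3)))
    print(single_rotor_action(np.zeros(4)))
```

In this sketch the body-rate path closes a feedback loop on measured rates before anything reaches the motors, whereas the single-rotor path passes the policy output through open loop. One plausible reading of the benchmark result is that this inner loop absorbs part of the model mismatch between simulation and the real platform, though the abstract itself does not state the mechanism.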