Using Simulation Optimization to Improve Zero-shot Policy Transfer of Quadrotors
This work addresses the simulation-to-reality gap for quadrotor control, but its contribution is incremental, since it builds on existing methods for policy transfer.
The paper tackles the problem of transferring control policies from simulation to real-world quadrotors by optimizing simulation parameters. It finds that low-level controllers trained with reinforcement learning need a more accurate simulation than higher-level ones.
In this work, we propose a data-driven approach to optimize the parameters of a simulation such that control policies can be directly transferred from simulation to a real-world quadrotor. Our neural network-based policies take only onboard sensor data as input and run entirely on the embedded hardware. In extensive real-world experiments, we compare low-level Pulse-Width Modulated control with higher-level control structures such as Attitude Rate and Attitude, which utilize Proportional-Integral-Derivative controllers to output motor commands. Our experiments show that low-level controllers trained with reinforcement learning require a more accurate simulation than higher-level control policies.
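To make the idea of data-driven simulation optimization concrete, the following is a minimal sketch, not the paper's method. It assumes a hypothetical one-dimensional vertical dynamics model with two unknown simulator parameters (a thrust gain and a motor time constant) and uses a plain grid search as a stand-in for whatever optimizer the authors actually use. The "real" data here is synthetic, generated from chosen parameter values, and all function and variable names are illustrative.

```python
# Toy sketch: fit simulator parameters to logged flight data by grid search.
# The 1-D dynamics model, the names, and the parameter values are all
# hypothetical and are not taken from the paper.

G = 9.81   # gravity, m/s^2
DT = 0.01  # integration step, s


def simulate(thrust_gain, motor_tau, commands):
    """Roll out vertical velocity for a sequence of normalized motor commands."""
    thrust = 0.0    # mass-normalized thrust, m/s^2
    velocity = 0.0  # vertical velocity, m/s
    velocities = []
    for u in commands:
        # First-order motor lag toward the commanded thrust (explicit Euler).
        thrust += DT * (thrust_gain * u - thrust) / motor_tau
        velocity += DT * (thrust - G)
        velocities.append(velocity)
    return velocities


def mse(a, b):
    """Mean squared error between two equally long sequences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def fit_parameters(commands, real_velocities, gain_grid, tau_grid):
    """Return the (gain, tau) pair whose rollout best matches the logged data."""
    best = None
    best_loss = float("inf")
    for gain in gain_grid:
        for tau in tau_grid:
            loss = mse(simulate(gain, tau, commands), real_velocities)
            if loss < best_loss:
                best, best_loss = (gain, tau), loss
    return best, best_loss


# Stand-in for real flight logs: synthetic data from assumed "true" parameters.
commands = [0.5 if (i // 50) % 2 == 0 else 0.7 for i in range(300)]
real_velocities = simulate(20.0, 0.05, commands)

best_params, best_loss = fit_parameters(
    commands,
    real_velocities,
    gain_grid=[16.0, 18.0, 20.0, 22.0, 24.0],
    tau_grid=[0.02, 0.05, 0.10],
)
print(best_params, best_loss)
```

In a real setting, the synthetic rollout would be replaced by logged onboard sensor data, and the grid search by an optimizer suited to a larger parameter space. The sketch only illustrates the loop of simulating, comparing against recorded data, and keeping the best-matching parameters.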