Learning to Race in Minutes: Infoprop Dyna on the Mini Wheelbot
For roboticists, this work demonstrates that model-based RL can learn quickly and directly on real hardware with unstable dynamics, without a simulator, though the method itself is incremental.
Infoprop Dyna, an uncertainty-aware MBRL framework, enables the Mini Wheelbot unicycle robot to learn to race around a track within 11 minutes of real-world experience, bypassing the need for physics-based simulators.
Reinforcement Learning (RL) has the potential to enable robots with fast, nonlinear, and unstable dynamics to reach the limits of their performance. However, most recent advances rely on carefully designed physics-based simulators and domain randomization to achieve successful sim-to-real transfer within reasonable wall-clock time. In this work, we bypass the need for such simulators and demonstrate that Infoprop Dyna, a state-of-the-art uncertainty-aware model-based reinforcement learning (MBRL) framework, can enable robots to learn directly from real-world interactions. Using Infoprop Dyna, the Mini Wheelbot, an underactuated unicycle robot, learns to race around a track within 11 minutes of real-world experience.
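To make the phrase "uncertainty-aware model-based reinforcement learning" concrete, here is a minimal, generic sketch of a Dyna-style data pipeline: fit a dynamics model on real transitions, then generate synthetic rollouts from it and cut them off when the model becomes unsure. Everything in it is an assumption made for illustration and is not taken from the paper: the 1-D toy system, the bootstrapped linear ensemble, the names `fit_ensemble` and `model_rollout`, and the disagreement threshold `max_std`. In particular, ensemble disagreement is a stand-in here; the abstract does not describe how Infoprop Dyna itself quantifies uncertainty or terminates rollouts.

```python
import numpy as np

def fit_ensemble(states, actions, next_states, n_models=5, seed=0):
    """Fit bootstrapped linear models s' ~ a*s + b*u.

    Hypothetical stand-in for a learned dynamics model; not the paper's model.
    """
    rng = np.random.default_rng(seed)
    X = np.column_stack([states, actions])
    models = []
    for _ in range(n_models):
        idx = rng.integers(0, len(X), size=len(X))  # bootstrap resample
        coef, *_ = np.linalg.lstsq(X[idx], next_states[idx], rcond=None)
        models.append(coef)
    return np.array(models)  # shape (n_models, 2)

def model_rollout(models, s0, policy, horizon=20, max_std=0.05):
    """Roll out the ensemble mean; stop once member disagreement exceeds max_std."""
    s, synthetic = s0, []
    for _ in range(horizon):
        u = policy(s)
        preds = models @ np.array([s, u])  # one next-state prediction per member
        if preds.std() > max_std:          # model is too uncertain: truncate here
            break
        s_next = preds.mean()
        synthetic.append((s, u, s_next))
        s = s_next
    return synthetic

# "Real" experience from a toy unstable system s' = 1.1*s + 0.5*u + noise (assumed).
rng = np.random.default_rng(1)
states = rng.uniform(-1.0, 1.0, size=200)
actions = rng.uniform(-1.0, 1.0, size=200)
next_states = 1.1 * states + 0.5 * actions + 0.01 * rng.standard_normal(200)

models = fit_ensemble(states, actions, next_states)
policy = lambda s: -1.5 * s  # hypothetical stabilizing feedback law
synthetic = model_rollout(models, s0=0.5, policy=policy)
strict = model_rollout(models, s0=0.5, policy=policy, max_std=1e-9)
```

In a full Dyna-style loop, the tuples in `synthetic` would be added to the buffer used to train the policy, and the model would be refit as new real transitions arrive. Setting `max_std` very low, as in `strict`, rejects essentially all model-generated data, which illustrates the trade-off between sample efficiency and trusting an imperfect model.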