RO LG ML | Oct 26, 2020

High Acceleration Reinforcement Learning for Real-World Juggling with Binary Rewards

arXiv:2010.13483v3 | 16.333 citations

Originality: Incremental advance

AI Analysis

This paper addresses real-world robot learning for dynamic tasks, offering a practical solution with incremental improvements in sample efficiency and safety.

The paper tackles the challenge of enabling robots to learn high-acceleration tasks such as juggling in the real world. It proposes a learning system that builds sample efficiency and safety into the policy representation, initialization, and optimization. With this system, the robot learns to juggle two balls from 56 minutes of experience with a binary reward signal, and the final policy juggles continuously for up to 33 minutes, or about 4500 catches.

Robots that can learn in the physical world will be important for enabling robots to escape their stiff and pre-programmed movements. For dynamic high-acceleration tasks, such as juggling, learning in the real world is particularly challenging as one must push the limits of the robot and its actuation without harming the system, amplifying the necessity of sample efficiency and safety for robot learning algorithms. In contrast to prior work, which mainly focuses on the learning algorithm, we propose a learning system that directly incorporates these requirements in the design of the policy representation, initialization, and optimization. We demonstrate that this system enables the high-speed Barrett WAM manipulator to learn juggling two balls from 56 minutes of experience with a binary reward signal. The final policy juggles continuously for up to 33 minutes or about 4500 repeated catches. The videos documenting the learning process and the evaluation can be found at https://sites.google.com/view/jugglingbot
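To put the reported endurance figures in perspective, the short calculation below converts them into an average catch rate. It is only a back-of-the-envelope sketch: it assumes the "up to 33 minutes" and "about 4500 catches" figures describe the same run, and the function and constant names are illustrative rather than taken from the paper.

```python
# Figures quoted in the abstract; both are approximate.
RUN_MINUTES = 33   # longest continuous juggling run
CATCHES = 4500     # repeated catches in that run (assumed to be the same run)


def catch_rate(catches, minutes):
    """Return (catches per second, seconds between catches)."""
    seconds = minutes * 60
    per_second = catches / seconds
    return per_second, 1 / per_second


per_second, interval = catch_rate(CATCHES, RUN_MINUTES)
print(f"{per_second:.2f} catches/s, one catch every {interval:.2f} s")
# prints "2.27 catches/s, one catch every 0.44 s"
```

Under that assumption the arm completes a catch roughly every 0.44 seconds on average, which gives a sense of the motion speed the task demands.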

