RO · Sep 29, 2017

Shaping in Practice: Training Wheels to Learn Fast Hopping Directly in Hardware

Steve Heim, Felix Ruppert, Alborz A. Sarvestani, Alexander Spröwitz

arXiv:1709.10273v2 · 5.68 citations

Originality: Incremental advance

AI Analysis

This work addresses unstable exploration in hardware-based robot learning and is aimed at researchers and engineers. The approach is novel, but the advance is incremental because it builds on existing learning methods.

The paper tackles the challenge of applying learning directly to hardware robots by introducing 'training wheels': temporary physical modifications that shape the reward landscape to facilitate learning. The concept is demonstrated as a proof of concept with a robot leg learning to hop fast, a setup simple enough for the reward landscape to be mapped empirically.

Learning instead of designing robot controllers can greatly reduce engineering effort required, while also emphasizing robustness. Despite considerable progress in simulation, applying learning directly in hardware is still challenging, in part due to the necessity to explore potentially unstable parameters. We explore the concept of shaping the reward landscape with training wheels: temporary modifications of the physical hardware that facilitate learning. We demonstrate the concept with a robot leg mounted on a boom learning to hop fast. This proof of concept embodies typical challenges such as instability and contact, while being simple enough to empirically map out and visualize the reward landscape. Based on our results we propose three criteria for designing effective training wheels for learning in robotics. A video synopsis can be found at https://youtu.be/6iH5E3LrYh8.
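
As a rough illustration of what empirically mapping a reward landscape with and without training wheels could look like, here is a minimal Python sketch. Everything in it is an assumption made for illustration and none of it comes from the paper: the two-parameter controller, the optimum location, the stable-region radius, and the reward formulas are all invented, and the physical training wheels are only imitated by a flag that keeps the reward graded where the leg would otherwise fall.

```python
import numpy as np

# Hypothetical optimum of a 2-parameter hopping controller (illustrative only).
OPTIMUM = np.array([0.7, 0.6])
STABLE_RADIUS_SQ = 0.04  # assumed size of the region where the leg does not fall


def hop_reward(theta, training_wheels=False):
    """Toy reward (stand-in for hopping speed) for controller parameters theta."""
    d2 = float(np.sum((np.asarray(theta, dtype=float) - OPTIMUM) ** 2))
    if d2 < STABLE_RADIUS_SQ:
        return float(np.exp(-d2 / 0.05))  # stable hopping: reward peaks at OPTIMUM
    if training_wheels:
        # Assumed effect of training wheels: the leg stays upright, so the
        # reward is low but still slopes toward the stable region.
        return float(0.3 * np.exp(-d2 / 0.5))
    return 0.0  # the leg falls: flat reward, no information for the learner


def map_landscape(training_wheels, n=21):
    """Evaluate the toy reward on an n-by-n grid over [0, 1] x [0, 1]."""
    grid = np.linspace(0.0, 1.0, n)
    return np.array(
        [[hop_reward((a, b), training_wheels) for b in grid] for a in grid]
    )


def informative_fraction(landscape):
    """Fraction of grid points with nonzero reward."""
    return float(np.mean(landscape > 0.0))


plain = map_landscape(training_wheels=False)
shaped = map_landscape(training_wheels=True)
print("informative fraction without training wheels:", informative_fraction(plain))
print("informative fraction with training wheels:", informative_fraction(shaped))
```

In this toy setting the best parameters are the same in both landscapes; the shaped version only adds a slope in the region that would otherwise return a flat zero, which is the intuition behind shaping the landscape to make exploration easier.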
