AIRO Jun 18, 2018

Learning from Outside the Viability Kernel: Why we Should Build Robots that can Fall with Grace

arXiv:1806.06569v1
1 citation
Originality: Incremental advance
AI Analysis

The paper addresses a key bottleneck in applying reinforcement learning to robotics and offers an approach to improving training efficiency, but the advance appears incremental because it builds on existing model-based methods.

The paper tackles reward landscapes with large gradient-free patches in robot reinforcement learning. It demonstrates that initializing robots in unviable states (states from which failure is inevitable) can, counter-intuitively, improve learning outcomes, though no concrete numbers are provided.

Despite impressive results using reinforcement learning to solve complex problems from scratch, in robotics this has still been largely limited to model-based learning with very informative reward functions. One of the major challenges is that the reward landscape often has large patches with no gradient, making it difficult to sample gradients effectively. We show here that the robot state-initialization can have a more important effect on the reward landscape than is generally expected. In particular, we show the counter-intuitive benefit of including initializations that are unviable, in other words initializing in states that are doomed to fail.
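
The idea in the abstract can be illustrated with a toy sketch. This is not the paper's model or experiment: the one-dimensional unstable system, the bang-bang policy with a single parameter `push`, and the use of survival time as reward are all assumptions chosen for illustration, and every name below is hypothetical. With a maximum push of 0.1 and a growth factor of 1.1, states with |x| > 1.0 are unviable in this toy, since no allowed push can stop them from growing.

```python
def survival_time(x0, push, growth=1.1, x_fail=2.0, horizon=200):
    """Steps survived by the toy system x <- growth * x - push * sign(x).

    The episode ends in failure as soon as |x| exceeds x_fail.
    Survival time serves as the reward (an assumption of this sketch).
    """
    x = x0
    for step in range(horizon):
        if abs(x) > x_fail:
            return step
        sign = (x > 0) - (x < 0)  # -1, 0, or +1
        x = growth * x - push * sign
    return horizon


# Candidate policies: how hard the controller pushes back toward zero.
pushes = [0.04, 0.06, 0.08, 0.10]

viable_start = 0.05    # every candidate policy can keep this state bounded
unviable_start = 1.05  # beyond max(pushes) / (growth - 1) = 1.0: doomed to fail

viable_rewards = [survival_time(viable_start, p) for p in pushes]
unviable_rewards = [survival_time(unviable_start, p) for p in pushes]

print("viable start:  ", viable_rewards)
print("unviable start:", unviable_rewards)
```

In this toy, every candidate policy survives the full horizon from the viable start, so the reward is flat across `pushes` and carries no gradient. From the unviable start every policy fails, but stronger pushes delay the failure, so the reward varies with the policy parameter.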
