RO · AI · LG · Aug 10, 2019

Learning to Explore in Motion and Interaction Tasks

arXiv:1908.03731v1 · 1 citation
AI Analysis

This paper addresses slow policy convergence in robotic motion and interaction tasks, offering an incremental improvement toward more efficient learning.

The paper tackles the high sampling complexity of model-free reinforcement learning for robotic manipulation and locomotion tasks by developing an exploration strategy that leverages data from previously learned tasks. Simulations of a robot manipulator suggest that this approach can more than double learning speed, particularly when rewards are sparse.

Model-free reinforcement learning suffers from the high sampling complexity inherent to robotic manipulation or locomotion tasks. Most successful approaches typically use random sampling strategies, which lead to slow policy convergence. In this paper, we present a novel approach for efficient exploration that leverages previously learned tasks. We exploit the fact that the same system is used across many tasks and build a generative model for exploration based on data from previously solved tasks to improve the learning of new tasks. The approach also enables continuous learning of improved exploration strategies as novel tasks are learned. Extensive simulations on a robot manipulator performing a variety of motion and contact interaction tasks demonstrate the capabilities of the approach. In particular, our experiments suggest that the exploration strategy can more than double learning speed, especially when rewards are sparse. Moreover, the algorithm is robust to task variations and parameter tuning, making it beneficial for complex robotic problems.
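
As a rough illustration of the idea in the abstract (sampling exploratory actions from a model fitted to data from earlier tasks rather than from uninformed random noise), the sketch below fits a diagonal Gaussian to previously recorded actions. The abstract does not specify the paper's generative model, so the diagonal Gaussian, the function names, and the toy data are all assumptions made for illustration and should not be read as the authors' method.

```python
import random
import statistics


def fit_exploration_model(prior_actions):
    """Fit a diagonal Gaussian to actions recorded on previously solved tasks.

    prior_actions: list of equal-length action vectors (lists of floats).
    Returns one (mean, std) pair per action dimension.
    NOTE: a diagonal Gaussian is an illustrative stand-in, not the paper's model.
    """
    dims = list(zip(*prior_actions))  # transpose: one tuple per action dimension
    return [(statistics.fmean(d), statistics.pstdev(d)) for d in dims]


def sample_exploration_action(model, rng):
    """Draw one exploratory action from the fitted per-dimension Gaussians."""
    return [rng.gauss(mean, std) for mean, std in model]


def update_exploration_model(prior_actions, new_task_actions):
    """Refit after a new task is solved, pooling its data with the old data."""
    pooled = prior_actions + new_task_actions
    return pooled, fit_exploration_model(pooled)


# Hypothetical 2-D actions pooled from earlier tasks (made-up numbers).
prior_actions = [[0.9, -0.1], [1.1, 0.0], [1.0, 0.1], [1.2, -0.2]]
model = fit_exploration_model(prior_actions)

rng = random.Random(0)  # seeded for reproducibility
action = sample_exploration_action(model, rng)

# After solving a new task, fold its data back into the exploration model.
prior_actions, model_updated = update_exploration_model(
    prior_actions, [[0.8, 0.3], [1.0, 0.2]]
)
```

Refitting on the pooled data after each new task mirrors, in a very simplified way, the abstract's point that the exploration strategy can keep improving as novel tasks are learned.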
