LG · RO · ML · Nov 20, 2019

Evaluating task-agnostic exploration for fixed-batch learning of arbitrary future tasks

arXiv:1911.08666v1 · 5 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work addresses the high cost of generating robot experience in the real world by evaluating how well tasks can be learned offline from pre-collected datasets. Its contribution is incremental: it evaluates existing exploration methods rather than proposing new ones.

The paper tackles the problem of learning arbitrary tasks from a fixed dataset without further real-world interaction, evaluating popular exploration methods for offline reinforcement learning in robotics. It presents results on three simulated continuous control tasks and a real robot arm, with code and hyper-parameters publicly available.

Deep reinforcement learning has been shown to solve challenging tasks where large amounts of training experience are available, usually obtained online while learning the task. Robotics is a significant potential application domain for many of these algorithms, but generating robot experience in the real world is expensive, especially when each task requires a lengthy online training procedure. Off-policy algorithms can in principle learn arbitrary tasks from a sufficiently diverse fixed dataset. In this work, we evaluate popular exploration methods by generating robotics datasets for the purpose of learning to solve tasks completely offline without any further interaction in the real world. We present results on three popular continuous control tasks in simulation, as well as continuous control of a high-dimensional real robot arm. Code documenting all algorithms, experiments, and hyper-parameters is available at https://github.com/qutrobotlearning/batchlearning.
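The idea of learning an arbitrary task from a fixed, task-agnostic dataset can be illustrated with a minimal sketch. This is not the paper's code or experimental setup: the six-state chain environment, the uniform-random exploration policy, and the names `collect_batch`, `fit_q`, and `greedy_policy` are all assumptions made for illustration, and tabular batch Q-iteration stands in for the deep off-policy algorithms the paper evaluates. The dataset stores no rewards; a reward is assigned only after a goal is chosen.

```python
import random

N_STATES = 6        # hypothetical chain with states 0..5
ACTIONS = (-1, 1)   # step left or step right
GAMMA = 0.9


def step(state, action):
    """Toy deterministic dynamics: move along the chain, clipped at both ends."""
    return min(max(state + action, 0), N_STATES - 1)


def collect_batch(n_transitions, seed=0):
    """Task-agnostic exploration: a uniform-random walk that records no reward."""
    rng = random.Random(seed)
    batch, state = [], 0
    for _ in range(n_transitions):
        action = rng.choice(ACTIONS)
        next_state = step(state, action)
        batch.append((state, action, next_state))
        state = next_state
    return batch


def fit_q(batch, goal, sweeps=20, gamma=GAMMA):
    """Batch Q-iteration on the fixed dataset, relabelling rewards for `goal`."""
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(sweeps):
        for s, a, s2 in batch:
            if s2 == goal:
                target = 1.0  # reaching the goal ends the task
            else:
                target = gamma * max(q[(s2, b)] for b in ACTIONS)
            # Dynamics are deterministic, so the target can be stored directly.
            q[(s, a)] = target
    return q


def greedy_policy(q):
    """Pick the highest-valued action in every state."""
    return [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(N_STATES)]


if __name__ == "__main__":
    batch = collect_batch(5000)  # collected once, with no task in mind
    for goal in (5, 0):          # tasks chosen only after collection
        print(goal, greedy_policy(fit_q(batch, goal)))
```

The same batch is reused for both goals without any further interaction with the environment. This only works because the random walk happens to cover every state-action pair in such a small chain, which is the diversity requirement the abstract refers to; in high-dimensional continuous control, achieving that coverage is the hard part.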

Code Implementations: 1 repo
