LG
Nov 4, 2021

RLDS: an Ecosystem to Generate, Share and Use Datasets in Reinforcement Learning

arXiv:2111.02767v1
37 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the problem of dataset management and sharing for reinforcement learning researchers, though its contribution is incremental, since it builds on existing tools such as TFDS.

The paper introduces RLDS, an ecosystem for generating, sharing, and using datasets in reinforcement learning and related sequential decision-making fields, aiming to enhance reproducibility, accelerate research, and enable easy testing of algorithms across tasks.

We introduce RLDS (Reinforcement Learning Datasets), an ecosystem for recording, replaying, manipulating, annotating and sharing data in the context of Sequential Decision Making (SDM), including Reinforcement Learning (RL), Learning from Demonstrations, Offline RL or Imitation Learning. RLDS enables not only reproducibility of existing research and easy generation of new datasets, but also accelerates novel research. By providing a standard and lossless dataset format, it enables researchers to quickly test new algorithms on a wider range of tasks. The RLDS ecosystem makes it easy to share datasets without any loss of information and to be agnostic to the underlying original format when applying various data processing pipelines to large collections of datasets. In addition, RLDS provides tools for collecting data generated by either synthetic agents or humans, as well as for inspecting and manipulating the collected data. Ultimately, integration with TFDS facilitates the sharing of RL datasets with the research community.

Code Implementations
1 repo
