LG · AI · ML · Feb 20, 2019

Curiosity-Driven Experience Prioritization via Density Estimation

arXiv:1902.08039v3 · 61 citations
AI Analysis

The paper addresses imbalance in the replay data of reinforcement learning agents for robotics, offering an incremental improvement over existing experience-prioritization techniques.

The paper tackles the problem of imbalanced data in reinforcement learning by proposing a Curiosity-Driven Prioritization (CDP) framework to oversample trajectories with rare goal states, which improves performance and sample-efficiency in robotic manipulation tasks compared to state-of-the-art methods.

In Reinforcement Learning (RL), an agent explores the environment and collects trajectories into the memory buffer for later learning. However, the collected trajectories can easily be imbalanced with respect to the achieved goal states. The problem of learning from imbalanced data is a well-known problem in supervised learning, but has not yet been thoroughly researched in RL. To address this problem, we propose a novel Curiosity-Driven Prioritization (CDP) framework to encourage the agent to over-sample those trajectories that have rare achieved goal states. The CDP framework mimics the human learning process and focuses more on relatively uncommon events. We evaluate our methods using the robotic environment provided by OpenAI Gym. The environment contains six robot manipulation tasks. In our experiments, we combined CDP with Deep Deterministic Policy Gradient (DDPG) with or without Hindsight Experience Replay (HER). The experimental results show that CDP improves both performance and sample-efficiency of reinforcement learning agents, compared to state-of-the-art methods.
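The idea of over-sampling trajectories with rare achieved goal states can be illustrated with a small sketch. It is not the authors' implementation. The abstract does not say which density estimator or weighting scheme is used, so the Gaussian kernel density estimate, the rank-based weighting, and every function name and parameter below (`kde_density`, `rarity_priorities`, `sample_trajectories`, `bandwidth`) are assumptions made for illustration only.

```python
import numpy as np


def kde_density(goals, bandwidth=0.5):
    """Gaussian kernel density estimate at each achieved goal.

    Assumption: a plain KDE stands in for whatever density model is used.
    The normalising constant is omitted because only the ordering matters.
    """
    diffs = goals[:, None, :] - goals[None, :, :]
    sq_dist = np.sum(diffs ** 2, axis=-1)
    kernel = np.exp(-sq_dist / (2.0 * bandwidth ** 2))
    return kernel.mean(axis=1)


def rarity_priorities(density):
    """Turn densities into sampling probabilities (hypothetical scheme).

    The densest goal gets rank 1 and the rarest gets rank N, so rarer
    trajectories receive proportionally more probability mass.
    """
    order = np.argsort(-density)  # indices from densest to rarest
    ranks = np.empty(len(density), dtype=np.float64)
    ranks[order] = np.arange(1, len(density) + 1)
    return ranks / ranks.sum()


def sample_trajectories(probs, batch_size, rng):
    """Draw replay-buffer indices according to the rarity priorities."""
    return rng.choice(len(probs), size=batch_size, p=probs)


# Toy buffer: 95 trajectories ending near a common goal, 5 near a rare one.
rng = np.random.default_rng(0)
common = rng.normal(loc=0.0, scale=0.1, size=(95, 3))
rare = rng.normal(loc=2.0, scale=0.1, size=(5, 3))
goals = np.vstack([common, rare])

density = kde_density(goals)
probs = rarity_priorities(density)
batch = sample_trajectories(probs, 32, rng)
```

In this toy buffer the five rare trajectories end up with the largest sampling probabilities, so they appear in replay batches more often than uniform sampling would allow. The sampled indices would then feed an off-policy learner such as DDPG, with or without HER.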

