LG

Mar 2, 2018

Distributed Prioritized Experience Replay

arXiv:1803.00933v1

820 citations

Originality: Highly original
AI Analysis

This work addresses inefficient data usage and slow training in deep reinforcement learning. For researchers and practitioners, it is a strong, specific gain rather than a foundational breakthrough.

The paper tackled the challenge of scaling deep reinforcement learning by proposing a distributed architecture that decouples acting from learning, enabling agents to learn from orders of magnitude more data and achieving better final performance on the Arcade Learning Environment in a fraction of the wall-clock training time.

We propose a distributed architecture for deep reinforcement learning at scale, that enables agents to learn effectively from orders of magnitude more data than previously possible. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors. Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.
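The actor/learner split described in the abstract can be sketched in a few lines. This is a minimal single-process illustration, not the authors' implementation: the actors run sequentially rather than in parallel, the environment and network are replaced by random rewards, and every name here (`PrioritizedReplay`, `actor_step`, `learner_step`, `run`, the `alpha` exponent, the priority formulas) is a hypothetical stand-in chosen for the example.

```python
import random


class PrioritizedReplay:
    """Toy proportional prioritized replay (hypothetical, list-backed)."""

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha        # assumed exponent: how strongly priorities skew sampling
        self.items = []           # stored transitions
        self.priorities = []      # one priority per stored transition

    def add(self, transition, priority):
        # Drop the oldest transition once the buffer is full.
        if len(self.items) >= self.capacity:
            self.items.pop(0)
            self.priorities.pop(0)
        self.items.append(transition)
        self.priorities.append(priority)

    def sample(self, batch_size, rng):
        # Draw indices with probability proportional to priority ** alpha.
        weights = [p ** self.alpha for p in self.priorities]
        indices = rng.choices(range(len(self.items)), weights=weights, k=batch_size)
        return indices, [self.items[i] for i in indices]

    def update_priorities(self, indices, new_priorities):
        for i, p in zip(indices, new_priorities):
            self.priorities[i] = p


def actor_step(actor_id, step, replay, rng):
    """Stand-in for one actor acting in its own environment instance."""
    transition = {"actor": actor_id, "step": step, "reward": rng.random()}
    # Placeholder priority; a real system would derive it from the agent's error signal.
    replay.add(transition, abs(transition["reward"]) + 1e-3)


def learner_step(replay, batch_size, rng):
    """Stand-in for one learner update: sample, 'train', refresh priorities."""
    indices, batch = replay.sample(batch_size, rng)
    # A real learner would update the shared network here; we only rescale priorities.
    new_priorities = [0.5 * abs(t["reward"]) + 1e-3 for t in batch]
    replay.update_priorities(indices, new_priorities)
    return batch


def run(num_actors=4, steps=50, batch_size=8, seed=0):
    rng = random.Random(seed)
    replay = PrioritizedReplay(capacity=100)
    sampled = 0
    for step in range(steps):
        # Acting: every actor writes into the one shared replay memory.
        for actor_id in range(num_actors):
            actor_step(actor_id, step, replay, rng)
        # Learning: the single learner replays a prioritized batch.
        sampled += len(learner_step(replay, batch_size, rng))
    return replay, sampled
```

Calling `run()` fills the shared buffer from four simulated actors while the learner samples from it. In a distributed setting the two loops would run as separate processes communicating through the replay memory, which this sketch does not attempt to model.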

Code Implementations: 15 repos

Data from Papers with Code (CC-BY-SA-4.0)

