LG · Nov 18, 2016

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

arXiv:1611.06256v3 · 295 citations · Has Code
Originality: Incremental advance
AI Analysis

This work addresses performance bottlenecks for researchers and practitioners using A3C in gaming tasks, though it is incremental as it optimizes an existing method.

The paper tackles the computational inefficiency of the Asynchronous Advantage Actor-Critic (A3C) algorithm in reinforcement learning by developing a hybrid CPU/GPU version, achieving a significant speedup over a CPU implementation.

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speedup compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
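To make the "system of queues" idea concrete, here is a minimal sketch of one way several CPU-side agents could share a single batched model through a request queue. It is not the code from the NVlabs/GA3C repository: the names `dummy_policy`, `Predictor`, `agent`, and `run_demo` are hypothetical, the doubling function is only a stand-in for a batched GPU forward pass, and the dynamic scheduling strategy mentioned in the abstract is not modelled.

```python
import queue
import threading


def dummy_policy(states):
    # Hypothetical stand-in for one batched forward pass on the GPU.
    return [2 * s for s in states]


class Predictor(threading.Thread):
    """Drains the shared request queue and serves requests in batches."""

    def __init__(self, request_q, max_batch=8):
        super().__init__(daemon=True)
        self.request_q = request_q
        self.max_batch = max_batch
        self.batch_sizes = []  # recorded only for inspection

    def run(self):
        while True:
            item = self.request_q.get()  # block until a request arrives
            if item is None:  # None is the shutdown signal
                return
            batch, stop = [item], False
            # Opportunistically add whatever else is already waiting.
            while len(batch) < self.max_batch:
                try:
                    nxt = self.request_q.get_nowait()
                except queue.Empty:
                    break
                if nxt is None:
                    stop = True
                    break
                batch.append(nxt)
            outputs = dummy_policy([state for state, _ in batch])
            # Route each result back to the agent that asked for it.
            for (_, reply_q), out in zip(batch, outputs):
                reply_q.put(out)
            self.batch_sizes.append(len(batch))
            if stop:
                return


def agent(agent_id, request_q, steps, results):
    # Each agent owns a private reply queue and waits for its prediction.
    reply_q = queue.Queue(maxsize=1)
    outputs = []
    for t in range(steps):
        state = agent_id * 100 + t  # toy "observation"
        request_q.put((state, reply_q))
        outputs.append(reply_q.get())
    results[agent_id] = outputs


def run_demo(n_agents=4, steps=5):
    request_q = queue.Queue()
    predictor = Predictor(request_q)
    predictor.start()
    results = {}
    threads = [
        threading.Thread(target=agent, args=(i, request_q, steps, results))
        for i in range(n_agents)
    ]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    request_q.put(None)  # tell the predictor to shut down
    predictor.join()
    return results, predictor.batch_sizes
```

Calling `run_demo()` returns each agent's outputs and the batch sizes the predictor happened to form. The batch sizes vary from run to run with thread timing, which is the point of the design: the model is invoked on whatever requests have accumulated rather than once per agent step.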
