LG Nov 18, 2016

Reinforcement Learning through Asynchronous Advantage Actor-Critic on a GPU

Mohammad Babaeizadeh, Iuri Frosio, Stephen Tyree, Jason Clemons, Jan Kautz

arXiv:1611.06256v3 25.0 295 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses performance bottlenecks for researchers and practitioners who use A3C on gaming tasks; it is an incremental advance because it optimizes an existing method rather than proposing a new one.

The paper tackles the computational inefficiency of the Asynchronous Advantage Actor-Critic (A3C) reinforcement learning algorithm by developing a hybrid CPU/GPU version that achieves a significant speed-up over a CPU implementation.

We introduce a hybrid CPU/GPU version of the Asynchronous Advantage Actor-Critic (A3C) algorithm, currently the state-of-the-art method in reinforcement learning for various gaming tasks. We analyze its computational traits and concentrate on aspects critical to leveraging the GPU's computational power. We introduce a system of queues and a dynamic scheduling strategy, potentially helpful for other asynchronous algorithms as well. Our hybrid CPU/GPU version of A3C, based on TensorFlow, achieves a significant speed-up compared to a CPU implementation; we make it publicly available to other researchers at https://github.com/NVlabs/GA3C.
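The abstract does not spell out how the queues work, so the following is only a minimal sketch of one way a prediction queue could let many CPU-side agents share a single batched model call. It is not the GA3C code: `fake_policy`, `predictor`, `agent`, and `run` are hypothetical names, the "model" is a toy function standing in for a GPU forward pass, and the training queue and dynamic scheduling are omitted.

```python
import queue
import threading


def fake_policy(batch):
    # Hypothetical stand-in for a batched GPU forward pass:
    # maps each state to one of 4 discrete actions.
    return [sum(state) % 4 for state in batch]


def predictor(request_q, reply_qs, max_batch, stop):
    # Drain up to max_batch pending requests and serve them with one model call.
    while not stop.is_set():
        try:
            agent_id, state = request_q.get(timeout=0.05)
        except queue.Empty:
            continue
        ids, states = [agent_id], [state]
        while len(ids) < max_batch:
            try:
                agent_id, state = request_q.get_nowait()
            except queue.Empty:
                break
            ids.append(agent_id)
            states.append(state)
        # Route each action back to the agent that asked for it.
        for agent_id, action in zip(ids, fake_policy(states)):
            reply_qs[agent_id].put(action)


def agent(agent_id, request_q, reply_q, steps, log):
    # Toy environment loop: the state is just [agent_id, step].
    state = [agent_id, 0]
    for t in range(steps):
        request_q.put((agent_id, state))
        action = reply_q.get()  # block until the predictor answers
        log.append((agent_id, t, action))
        state = [agent_id, t + 1]


def run(num_agents=4, steps=5, max_batch=8):
    request_q = queue.Queue()
    reply_qs = [queue.Queue() for _ in range(num_agents)]
    stop = threading.Event()
    log = []
    pred = threading.Thread(
        target=predictor, args=(request_q, reply_qs, max_batch, stop)
    )
    pred.start()
    agents = [
        threading.Thread(target=agent, args=(i, request_q, reply_qs[i], steps, log))
        for i in range(num_agents)
    ]
    for a in agents:
        a.start()
    for a in agents:
        a.join()
    stop.set()
    pred.join()
    return log
```

Calling `run()` returns one `(agent_id, step, action)` record per agent step. The point of the design is that agents never call the model directly; they only exchange messages with queues, so the batch size seen by the model depends on how many requests happen to be waiting.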
