ML, LG | May 13, 2018

GAN Q-learning

arXiv:1805.04874v3 | 20 citations
AI Analysis

This is an incremental improvement for reinforcement learning practitioners, offering a new distributional RL method using GANs.

The paper proposes GAN Q-learning, a distributional reinforcement learning method based on generative adversarial networks, as one way to apply the distributional approach in complex Markov Decision Processes. It reports empirically that the method is a viable alternative to traditional methods in simple tabular environments and OpenAI Gym.

Distributional reinforcement learning (distributional RL) has seen empirical success in complex Markov Decision Processes (MDPs) in the setting of nonlinear function approximation. However, there are many different ways in which one can leverage the distributional approach to reinforcement learning. In this paper, we propose GAN Q-learning, a novel distributional RL method based on generative adversarial networks (GANs) and analyze its performance in simple tabular environments, as well as OpenAI Gym. We empirically show that our algorithm leverages the flexibility and blackbox approach of deep learning models while providing a viable alternative to traditional methods.
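The abstract does not spell out the mechanics, so the following is only a minimal sketch of one way a GAN-style generator could stand in for a return distribution in Q-learning. It assumes the generator maps noise to return samples for each state-action pair, that actions are chosen by the mean of those samples, and that the "real" samples a discriminator would see are Bellman targets built from the generator's own samples at the next state. The location-scale table (`mu`, `log_sigma`) and all function names are hypothetical stand-ins for a neural network, and the discriminator and the adversarial update are omitted.

```python
import math


def sample_returns(mu, log_sigma, state, action, noise):
    """Toy 'generator': turn noise values into return samples for (state, action).

    mu and log_sigma are hypothetical tables indexed as [state][action];
    a real model would replace them with a neural network.
    """
    scale = math.exp(log_sigma[state][action])
    return [mu[state][action] + scale * z for z in noise]


def greedy_action(mu, log_sigma, state, noise):
    """Pick the action whose generated return samples have the highest mean."""
    n_actions = len(mu[state])
    means = [
        sum(sample_returns(mu, log_sigma, state, a, noise)) / len(noise)
        for a in range(n_actions)
    ]
    return max(range(n_actions), key=means.__getitem__)


def bellman_target_samples(mu, log_sigma, reward, next_state, done, gamma, noise):
    """Build target samples r + gamma * Z(s', a*) from the generator itself.

    In an adversarial setup these would play the role of 'real' data, while
    sample_returns(...) at the current (state, action) would be the 'fake' data.
    """
    if done:
        # No bootstrapping past a terminal transition.
        return [float(reward) for _ in noise]
    a_star = greedy_action(mu, log_sigma, next_state, noise)
    next_samples = sample_returns(mu, log_sigma, next_state, a_star, noise)
    return [reward + gamma * z for z in next_samples]
```

A training loop would then update the generator so that its samples at the current state-action pair become hard to tell apart from these targets. Which GAN objective is used for that step is a design choice this sketch leaves open.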

Code Implementations: 1 repo