LG, ML · Feb 15, 2019

Asynchronous Coagent Networks

arXiv:1902.05650v4 · 10 citations
Originality: Incremental advance
AI Analysis

This work provides theoretical foundations for coagent-based reinforcement learning algorithms, so that hierarchical methods can be developed without custom derivations.

The paper proves that coagent policy gradient algorithms converge to locally optimal policies and extends the theory to asynchronous and recurrent networks, simplifying the design and analysis of hierarchical reinforcement learning algorithms like option-critic.

Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks. In this work, we prove that CPGAs converge to locally optimal policies. Additionally, we extend prior theory to encompass asynchronous and recurrent coagent networks. These extensions facilitate the straightforward design and analysis of hierarchical reinforcement learning algorithms like the option-critic, and eliminate the need for complex derivations of customized learning rules for these algorithms.
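
As a rough illustration of the kind of local update a CPGA performs, the sketch below chains two stochastic units (coagents) and has each one apply a REINFORCE-style update to its own parameters, using only its own input, its own output, and the shared return. Everything specific here is an assumption made for illustration and is not taken from the paper: the toy one-step task (reward 1 when the final action matches the state), the tabular softmax parameterization, the names `Coagent`, `train`, and `evaluate`, and the hyperparameters. It covers only the synchronous, feed-forward case, not the asynchronous or recurrent extensions.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


class Coagent:
    """Hypothetical tabular softmax unit: discrete input -> discrete output."""

    def __init__(self, n_inputs, n_outputs, rng):
        self.rng = rng
        self.theta = [[rng.uniform(-0.1, 0.1) for _ in range(n_outputs)]
                      for _ in range(n_inputs)]

    def act(self, x):
        probs = softmax(self.theta[x])
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def update(self, x, out, ret, lr):
        # Local policy gradient step: theta += lr * return * grad log pi(out | x).
        # For a softmax, d log pi(out|x) / d theta[x][k] = 1[k == out] - pi(k|x).
        probs = softmax(self.theta[x])
        for k in range(len(probs)):
            indicator = 1.0 if k == out else 0.0
            self.theta[x][k] += lr * ret * (indicator - probs[k])


def train(episodes=20000, lr=0.2, seed=0):
    rng = random.Random(seed)
    first = Coagent(2, 2, rng)   # sees the state, emits a hidden signal u
    second = Coagent(2, 2, rng)  # sees only u, emits the action
    for _ in range(episodes):
        s = rng.randrange(2)
        u = first.act(s)
        a = second.act(u)
        ret = 1.0 if a == s else 0.0  # toy reward (assumption)
        # Each coagent treats the rest of the network as part of its
        # environment and updates from its own (input, output) pair.
        first.update(s, u, ret, lr)
        second.update(u, a, ret, lr)
    return first, second


def evaluate(first, second, trials=2000, seed=1):
    """Fraction of trials in which the network's action matches the state."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = rng.randrange(2)
        if second.act(first.act(s)) == s:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    first, second = train()
    print("accuracy:", evaluate(first, second))
```

Neither coagent differentiates through the other. The second unit never sees the state and the first never sees the action, yet both updates use the same return. This is the property that removes the need to derive a custom learning rule for each architecture.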

