LG · Dec 29, 2017

Boosting the Actor with Dual Critic

arXiv:1712.10282v1 · 56 citations
Originality: Incremental advance
AI Analysis

This work addresses a core problem in reinforcement learning, namely how critic learning relates to the actor's objective in actor-critic algorithms, and aims to improve both algorithmic transparency and performance. It reads as an incremental enhancement of existing actor-critic methods.

The paper tackles the challenge of aligning actor and critic updates in actor-critic algorithms. It proposes Dual-AC, derived from the Lagrangian dual of the Bellman optimality equation, and reports state-of-the-art performance on several benchmarks.

This paper proposes a new actor-critic-style algorithm called Dual Actor-Critic, or Dual-AC. It is derived in a principled way from the Lagrangian dual form of the Bellman optimality equation, which can be viewed as a two-player game between the actor and a critic-like function called the dual critic. Compared to its actor-critic relatives, Dual-AC has the desirable property that the actor and dual critic are updated cooperatively to optimize the same objective function, providing a more transparent way of learning a critic that is directly related to the actor's objective. We then provide a concrete algorithm that can effectively solve the minimax optimization problem, using multi-step bootstrapping, path regularization, and stochastic dual ascent. We demonstrate that the proposed algorithm achieves state-of-the-art performance across several benchmarks.
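To make the "two-player game" reading concrete, here is a minimal tabular sketch of the Lagrangian saddle point behind the Bellman optimality equation. It is not the paper's Dual-AC algorithm: it uses plain value iteration instead of multi-step bootstrapping, path regularization, or stochastic dual ascent. The toy MDP (P, R), the initial distribution MU, the (1 - GAMMA) normalization, and all function names are assumptions made for illustration. The Lagrangian used is L(V, rho) = (1 - gamma) * E_mu[V(s)] + sum over (s, a) of rho(s, a) * (R(s, a) + gamma * E[V(s')] - V(s)), minimized over V and maximized over rho >= 0.

```python
# Hypothetical 2-state, 2-action MDP; all numbers are illustrative only.
GAMMA = 0.9
MU = [0.5, 0.5]                      # initial-state distribution
P = [[[0.8, 0.2], [0.1, 0.9]],       # P[s][a][s2]: transition probabilities
     [[0.5, 0.5], [0.0, 1.0]]]
R = [[1.0, 0.0],                     # R[s][a]: immediate rewards
     [0.0, 2.0]]
STATES, ACTIONS = range(2), range(2)


def backup(V, s, a):
    """One-step Bellman backup R(s,a) + gamma * E[V(s')]."""
    return R[s][a] + GAMMA * sum(P[s][a][s2] * V[s2] for s2 in STATES)


def lagrangian(V, rho):
    """L(V, rho): primal objective plus rho-weighted Bellman residuals."""
    primal = (1 - GAMMA) * sum(MU[s] * V[s] for s in STATES)
    slack = sum(rho[s][a] * (backup(V, s, a) - V[s])
                for s in STATES for a in ACTIONS)
    return primal + slack


def solve(iters=2000):
    """Return (V*, greedy policy, its discounted occupancy measure rho*)."""
    # Primal player: value iteration on the Bellman optimality equation.
    V = [0.0 for _ in STATES]
    for _ in range(iters):
        V = [max(backup(V, s, a) for a in ACTIONS) for s in STATES]
    pi = [max(ACTIONS, key=lambda a: backup(V, s, a)) for s in STATES]

    # Dual player: normalized discounted state occupancy d of the greedy
    # policy, the fixed point of d = (1 - gamma) * mu + gamma * P_pi^T d.
    d = list(MU)
    for _ in range(iters):
        d = [(1 - GAMMA) * MU[s2]
             + GAMMA * sum(d[s] * P[s][pi[s]][s2] for s in STATES)
             for s2 in STATES]
    rho = [[d[s] if a == pi[s] else 0.0 for a in ACTIONS] for s in STATES]
    return V, pi, rho


V_star, pi_star, rho_star = solve()
primal_value = (1 - GAMMA) * sum(MU[s] * V_star[s] for s in STATES)
dual_value = sum(rho_star[s][a] * R[s][a] for s in STATES for a in ACTIONS)

if __name__ == "__main__":
    print("V*:", V_star, "greedy policy:", pi_star)
    print("primal:", primal_value, "dual:", dual_value,
          "L(V*, rho*):", lagrangian(V_star, rho_star))
```

In this sketch the value function plays the minimizing role and the occupancy measure rho (a state distribution times a policy, which is where the actor enters) plays the maximizing role. At the pair (V*, rho*) the primal value, the dual value, and the Lagrangian coincide, which is the sense in which both players optimize the same objective.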

