LG · OC · Jul 28, 2015

A constrained optimization perspective on actor critic algorithms and application to network routing

arXiv:1507.07984v1 · 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for reinforcement learning algorithms with convergence guarantees in domains such as network routing. It appears incremental: it builds on existing actor-critic frameworks and adds a constrained optimization perspective.

The authors tackle the problem of designing a convergent actor-critic algorithm for discounted reward Markov decision processes. They propose a method with guaranteed convergence to an optimal policy and demonstrate its practicality on a network routing application.

We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.
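For readers unfamiliar with the actor-critic structure the abstract refers to, here is a minimal sketch of a generic one-step actor-critic on a discounted reward MDP. It is not the paper's algorithm: the actor below uses a standard softmax policy-gradient step, not the descent direction derived from the non-linear optimization problem. The toy MDP, the function names (`step`, `softmax`, `actor_critic`) and the step sizes are assumptions made purely for illustration.

```python
import math
import random

# Hypothetical toy MDP (not from the paper): 2 states, 2 actions.
# Action 1 always pays reward 1, action 0 pays 0; the next state is uniform.
N_STATES, N_ACTIONS = 2, 2
GAMMA = 0.9  # discount factor


def step(state, action, rng):
    """Return (reward, next_state) for the toy MDP."""
    reward = 1.0 if action == 1 else 0.0
    return reward, rng.randrange(N_STATES)


def softmax(prefs):
    """Numerically stable softmax over a list of action preferences."""
    m = max(prefs)
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def actor_critic(n_steps=20000, critic_lr=0.1, actor_lr=0.05, seed=0):
    """Generic one-step actor-critic: tabular critic, softmax actor."""
    rng = random.Random(seed)
    values = [0.0] * N_STATES                             # critic: V(s)
    prefs = [[0.0] * N_ACTIONS for _ in range(N_STATES)]  # actor: theta(s, a)
    state = 0
    for _ in range(n_steps):
        probs = softmax(prefs[state])
        action = rng.choices(range(N_ACTIONS), weights=probs)[0]
        reward, next_state = step(state, action, rng)
        # Critic: temporal-difference error for the discounted return.
        td_error = reward + GAMMA * values[next_state] - values[state]
        values[state] += critic_lr * td_error
        # Actor: policy-gradient step on the softmax preferences
        # (a stand-in for the paper's descent direction, which differs).
        for a in range(N_ACTIONS):
            grad = (1.0 if a == action else 0.0) - probs[a]
            prefs[state][a] += actor_lr * td_error * grad
        state = next_state
    return [softmax(p) for p in prefs], values


policy, values = actor_critic()
```

In this sketch the critic uses a larger step size than the actor, a common choice so that the value estimates track the slowly changing policy. On the toy MDP the learned policy should come to favor action 1 in both states.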
