LG OC Jul 28, 2015

A constrained optimization perspective on actor critic algorithms and application to network routing

Prashanth L. A., H. L. Prasad, Shalabh Bhatnagar, Prakash Chandra

arXiv:1507.07984v11.13 citations h-index: 30

Originality: Incremental advance

AI Analysis

This work addresses the need for reliable reinforcement learning algorithms in domains such as network routing. It appears incremental, since it builds on existing actor-critic frameworks and adds a constrained optimization perspective.

The authors tackle the problem of designing a convergent actor-critic algorithm for discounted reward Markov decision processes. The result is a method with guaranteed convergence to an optimal policy, whose practicality is demonstrated on a network routing application.

We propose a novel actor-critic algorithm with guaranteed convergence to an optimal policy for a discounted reward Markov decision process. The actor incorporates a descent direction that is motivated by the solution of a certain non-linear optimization problem. We also discuss an extension to incorporate function approximation and demonstrate the practicality of our algorithms on a network routing application.
