LG, AI | Oct 18, 2020

Average-reward model-free reinforcement learning: a systematic review and literature mapping

Vektor Dewanto, George Dunn, Ali Eshragh, Marcus Gallagher, Fred Roosta

arXiv:2010.08920v2 | 13.641 citations

Originality: Synthesis-oriented

AI Analysis

This is an incremental contribution that synthesizes existing research for the reinforcement learning community, identifying gaps for future work.

The paper provides an updated systematic review and literature mapping of model-free reinforcement learning using the average-reward optimality criterion, extending beyond a previous survey to include policy-iteration and function approximation methods.

Reinforcement learning is an important part of artificial intelligence. In this paper, we review model-free reinforcement learning that uses the average-reward optimality criterion in the infinite-horizon setting. Motivated by the solo survey by Mahadevan (1996a), we provide an updated review of work in this area and extend it to cover policy-iteration and function approximation methods (in addition to the value-iteration and tabular counterparts). We present a comprehensive literature mapping. We also identify and discuss opportunities for future work.
