LG | AI | Oct 18, 2020

Average-reward model-free reinforcement learning: a systematic review and literature mapping

arXiv:2010.08920v2 | 41 citations
Originality: Synthesis-oriented
AI Analysis

This is an incremental contribution that synthesizes existing research for the reinforcement learning community, identifying gaps for future work.

The paper provides an updated systematic review and literature mapping of model-free reinforcement learning using the average-reward optimality criterion, extending beyond a previous survey to include policy-iteration and function approximation methods.

Reinforcement learning is an important part of artificial intelligence. In this paper, we review model-free reinforcement learning that utilizes the average-reward optimality criterion in the infinite horizon setting. Motivated by the solo survey by Mahadevan (1996a), we provide an updated review of work in this area and extend it to cover policy-iteration and function approximation methods (in addition to the value-iteration and tabular counterparts). We present a comprehensive literature mapping. We also identify and discuss opportunities for future work.

