LG AI OC ML
Sep 22, 2020

Is Q-Learning Provably Efficient? An Extended Analysis

Kushagra Rastogi, Jonathan Lee, Fabrice Harel-Canada, Aditya Joglekar

arXiv:2009.10396v1

Originality: Synthesis-oriented

AI Analysis

The paper provides stronger theoretical guarantees for model-free reinforcement learning. The contribution is incremental, as it builds on prior work.

This paper extends the analysis of Q-learning's theoretical efficiency, showing that Q-learning with UCB exploration achieves sample efficiency matching the optimal regret of model-based approaches.

This work extends the analysis of the theoretical results presented in the paper "Is Q-Learning Provably Efficient?" by Jin et al. We include a survey of related research to contextualize the need for strengthening the theoretical guarantees related to perhaps one of the most important threads of model-free reinforcement learning. We also expound upon the reasoning used in the proofs to highlight the critical steps leading to the main result, which shows that Q-learning with UCB exploration achieves a sample efficiency that matches the optimal regret that can be achieved by any model-based approach.
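To make the object of the analysis concrete, the following is a minimal sketch of episodic tabular Q-learning with a Hoeffding-style UCB bonus, in the spirit of the algorithm studied by Jin et al. It assumes rewards in [0, 1], a fixed start state, a learning rate of (H + 1) / (H + t), and a bonus of c * sqrt(H^3 * iota / t) with iota = log(S * A * T / p) and T = K * H. The constant c, the failure probability p, the function names, and the toy environment `toy_step` are illustrative assumptions, not taken from this paper.

```python
import math
import random


def learning_rate(H, t):
    """Step size alpha_t = (H + 1) / (H + t) for the t-th visit to a (step, state, action)."""
    return (H + 1) / (H + t)


def ucb_bonus(H, t, iota, c=1.0):
    """Hoeffding-style exploration bonus c * sqrt(H^3 * iota / t); c is an assumed constant."""
    return c * math.sqrt(H ** 3 * iota / t)


def toy_step(h, x, a, rng):
    """Hypothetical 2-state, 2-action environment: action 1 pays reward 1, state is random."""
    reward = 1.0 if a == 1 else 0.0
    next_state = rng.randrange(2)
    return reward, next_state


def q_learning_ucb(step, S, A, H, K, p=0.1, c=1.0, seed=0):
    """Run K episodes of horizon H and return the tables Q, V and visit counts N."""
    rng = random.Random(seed)
    iota = math.log(S * A * H * K / p)  # log(S * A * T / p) with T = K * H
    # Optimistic initialization: every Q-value starts at the maximum return H.
    Q = [[[float(H)] * A for _ in range(S)] for _ in range(H)]
    # V has one extra layer so that the value after the last step is 0.
    V = [[float(H)] * S for _ in range(H)] + [[0.0] * S]
    N = [[[0] * A for _ in range(S)] for _ in range(H)]

    for _ in range(K):
        x = 0  # assumed fixed start state
        for h in range(H):
            # Act greedily with respect to the optimistic Q estimate.
            a = max(range(A), key=lambda i: Q[h][x][i])
            r, x_next = step(h, x, a, rng)

            N[h][x][a] += 1
            t = N[h][x][a]
            alpha = learning_rate(H, t)
            target = r + V[h + 1][x_next] + ucb_bonus(H, t, iota, c)

            Q[h][x][a] = (1 - alpha) * Q[h][x][a] + alpha * target
            # Clip the value estimate at H, the largest achievable return.
            V[h][x] = min(float(H), max(Q[h][x]))
            x = x_next
    return Q, V, N
```

Calling `q_learning_ucb(toy_step, S=2, A=2, H=3, K=200)` returns the learned tables. Note that the sketch keeps no model of the transition probabilities, which is what makes the method model-free: only the Q, V, and count tables are stored.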
