LG ML · Aug 27, 2018

On the convergence of optimistic policy iteration for stochastic shortest path problem

arXiv:1808.08763v2 · 8 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the convergence of reinforcement learning algorithms for stochastic shortest path problems, but it appears incremental because it focuses on a special case of an existing method.

The paper studies the convergence of optimistic policy iteration for stochastic shortest path problems. It proves convergence results under the condition that the termination state is reached almost surely, with either Monte Carlo or TD(λ) methods used for the policy evaluation step.

In this paper, we prove some convergence results for a special case of the optimistic policy iteration algorithm for stochastic shortest path problems. We consider both Monte Carlo and TD($\lambda$) methods for the policy evaluation step, under the condition that the termination state will eventually be reached almost surely.
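As a rough illustration of the kind of algorithm the abstract refers to, the Python sketch below runs a generic optimistic policy iteration loop with Monte Carlo policy evaluation on a made-up stochastic shortest path problem. The toy transition model, the costs, the 1/(t+1) step size, and all function names are assumptions made for illustration. They are not taken from the paper, and the special case it analyzes may differ. The TD($\lambda$) variant is not shown.

```python
import random

# Hypothetical toy problem: states 0, 1, 2 are non-terminal, state 3 is the
# termination state. Every action has a positive chance of making progress,
# so termination is reached almost surely under any policy.
TERMINAL = 3
ACTIONS = (0, 1)


def transitions(state, action):
    """Return (cost, [(probability, next_state), ...]) for a state-action pair."""
    if action == 0:
        # Cheap action: usually advances by one state.
        return 1.0, [(0.8, state + 1), (0.2, state)]
    # Expensive action: jumps straight to termination half of the time.
    return 2.0, [(0.5, TERMINAL), (0.5, state)]


def q_value(J, state, action):
    """One-step lookahead cost of taking `action` in `state` under estimate J."""
    cost, outcomes = transitions(state, action)
    return cost + sum(p * J[nxt] for p, nxt in outcomes)


def greedy_policy(J):
    """Policy that minimizes the one-step lookahead cost in every state."""
    return [min(ACTIONS, key=lambda a: q_value(J, s, a)) for s in range(TERMINAL)]


def sample_return(policy, state, rng, max_steps=10_000):
    """Simulate one trajectory and return its total cost until termination."""
    total = 0.0
    for _ in range(max_steps):  # safety cap; termination is almost sure
        if state == TERMINAL:
            break
        cost, outcomes = transitions(state, policy[state])
        total += cost
        probs = [p for p, _ in outcomes]
        nexts = [n for _, n in outcomes]
        state = rng.choices(nexts, weights=probs)[0]
    return total


def optimistic_policy_iteration(iterations=2000, seed=0):
    """Alternate a greedy policy update with a single Monte Carlo update of J."""
    rng = random.Random(seed)
    J = [0.0] * (TERMINAL + 1)  # J[TERMINAL] stays 0
    for t in range(iterations):
        policy = greedy_policy(J)  # policy improvement on the current estimate
        step = 1.0 / (t + 1)       # assumed diminishing step size
        returns = [sample_return(policy, s, rng) for s in range(TERMINAL)]
        for s in range(TERMINAL):
            # Partial ("optimistic") evaluation: one sampled return per state.
            J[s] += step * (returns[s] - J[s])
    return J, greedy_policy(J)


if __name__ == "__main__":
    J, policy = optimistic_policy_iteration()
    print("cost-to-go estimates:", J)
    print("greedy policy:", policy)
```

The policy is recomputed after every single noisy evaluation step, rather than after the evaluation has converged. That is what makes the scheme "optimistic" and why its convergence needs a separate proof.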
