LG ML Oct 18, 2019

Autonomous exploration for navigating in non-stationary CMPs

Pratik Gajane, Ronald Ortner, Peter Auer, Csaba Szepesvari

arXiv:1910.08446v17.710 citations

Originality: Incremental advance

AI Analysis

This work addresses efficient navigation in dynamic environments for reinforcement learning, but it appears incremental because it builds on existing CMP frameworks.

The paper tackles the problem of learning to navigate in non-stationary controlled Markov processes where transition probabilities change abruptly, proposing a meta-algorithm called MNM and proving an upper bound on exploration steps in terms of the number of changes.

We consider a setting in which the objective is to learn to navigate in a controlled Markov process (CMP) where transition probabilities may abruptly change. For this setting, we propose a performance measure called exploration steps, which counts the time steps at which the learner lacks sufficient knowledge to navigate its environment efficiently. We devise a learning meta-algorithm, MNM, and prove an upper bound on the exploration steps in terms of the number of changes.
