LG MLNov 13, 2019

Kinematic State Abstraction and Provably Efficient Rich-Observation Reinforcement Learning

Dipendra Misra, Mikael Henaff, Akshay Krishnamurthy, John Langford

arXiv:1911.05815v128.5158 citations

Originality Highly original

AI Analysis

This addresses the challenge of sample-efficient exploration in complex, high-dimensional observation spaces for reinforcement learning practitioners, representing a significant advance rather than an incremental improvement.

The authors tackled the problem of reinforcement learning in environments with rich observations by introducing HOMER, an algorithm that uses kinematic state abstraction and strategic exploration, achieving sample complexity polynomial in latent states and time horizon without dependence on observation space size, and showing exponential sample efficiency gains over baselines in experiments.

We present an algorithm, HOMER, for exploration and reinforcement learning in rich observation environments that are summarizable by an unknown latent state space. The algorithm interleaves representation learning to identify a new notion of kinematic state abstraction with strategic exploration to reach new states using the learned abstraction. The algorithm provably explores the environment with sample complexity scaling polynomially in the number of latent states and the time horizon, and, crucially, with no dependence on the size of the observation space, which could be infinitely large. This exploration guarantee further enables sample-efficient global policy optimization for any reward function. On the computational side, we show that the algorithm can be implemented efficiently whenever certain supervised learning problems are tractable. Empirically, we evaluate HOMER on a challenging exploration problem, where we show that the algorithm is exponentially more sample efficient than standard reinforcement learning baselines.

View on arXiv PDF

Similar