LG · Aug 30, 2022

An Analysis of Model-Based Reinforcement Learning From Abstracted Observations

arXiv:2208.14407v3 · 3 citations · h-index: 34
Originality: Incremental advance
AI Analysis

The paper addresses a theoretical gap for reinforcement learning researchers, but the advance is incremental: it extends prior work rather than introducing a new paradigm.

The paper tackles the problem of combining model-based reinforcement learning (MBRL) with state abstraction, where no guarantees existed, and shows that using concentration inequalities for martingales can extend existing MBRL guarantees to this setting, as illustrated with the R-MAX algorithm.
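For orientation, the best-known concentration inequality for martingales is the Azuma-Hoeffding inequality, stated below. This is a standard textbook form; the summary does not say which martingale inequality the paper relies on, so the symbols $X_k$, $c_k$, $n$, and $\epsilon$ are illustrative assumptions rather than the paper's notation.

```latex
% Azuma-Hoeffding inequality (standard form, given here as an illustration).
% Let X_0, X_1, \dots, X_n be a martingale with bounded differences
% |X_k - X_{k-1}| \le c_k for k = 1, \dots, n. Then for every \epsilon > 0:
\[
  \Pr\bigl( \lvert X_n - X_0 \rvert \ge \epsilon \bigr)
  \;\le\;
  2 \exp\!\left( - \frac{\epsilon^{2}}{2 \sum_{k=1}^{n} c_k^{2}} \right).
\]
```

Unlike Hoeffding's inequality for independent samples, this bound only requires that each increment has zero conditional mean given the past, which is why bounds of this kind remain usable when samples depend on one another.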

Many methods for model-based reinforcement learning (MBRL) in Markov decision processes (MDPs) provide guarantees for both the accuracy of the model they can deliver and the learning efficiency. At the same time, state abstraction techniques allow for a reduction of the size of an MDP while maintaining a bounded loss with respect to the original problem. Therefore, it may come as a surprise that no such guarantees are available when combining both techniques, i.e., where MBRL merely observes abstract states. Our theoretical analysis shows that abstraction can introduce a dependence between samples collected online (e.g., in the real world). That means that, without taking this dependence into account, results for MBRL do not directly extend to this setting. Our result shows that we can use concentration inequalities for martingales to overcome this problem. This result makes it possible to extend the guarantees of existing MBRL algorithms to the setting with abstraction. We illustrate this by combining R-MAX, a prototypical MBRL algorithm, with abstraction, thus producing the first performance guarantees for model-based 'RL from Abstracted Observations': model-based reinforcement learning with an abstract model.
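To make the setting concrete, here is a minimal sketch of how an R-MAX-style learner might build its empirical model when it only sees abstract states. It is not the paper's algorithm: the abstraction `phi`, the class `AbstractRMaxModel`, the threshold `m`, and the optimistic self-loop for unknown pairs are all hypothetical simplifications chosen for illustration. The point to notice is that samples from different ground states are pooled under one abstract state-action pair, and which ground state generated each sample depends on the history of the run, so the pooled samples need not be independent.

```python
from collections import defaultdict


def phi(state: int, block_size: int = 2) -> int:
    """Hypothetical abstraction: group consecutive ground states into one abstract state."""
    return state // block_size


class AbstractRMaxModel:
    """Empirical model over abstract states with R-MAX-style optimism (illustrative only)."""

    def __init__(self, m: int, r_max: float):
        self.m = m          # samples required before an abstract pair counts as "known"
        self.r_max = r_max  # optimistic reward assigned to unknown pairs
        self.visits = defaultdict(int)
        self.reward_sum = defaultdict(float)
        self.next_counts = defaultdict(lambda: defaultdict(int))

    def update(self, s: int, a: int, r: float, s_next: int) -> None:
        # The learner never stores s or s_next, only their abstract images.
        key = (phi(s), a)
        if self.visits[key] >= self.m:
            return  # known pairs are frozen in this sketch
        self.visits[key] += 1
        self.reward_sum[key] += r
        self.next_counts[key][phi(s_next)] += 1

    def is_known(self, s_bar: int, a: int) -> bool:
        return self.visits[(s_bar, a)] >= self.m

    def reward(self, s_bar: int, a: int) -> float:
        if not self.is_known(s_bar, a):
            return self.r_max  # optimism in the face of uncertainty
        key = (s_bar, a)
        return self.reward_sum[key] / self.visits[key]

    def transition(self, s_bar: int, a: int) -> dict:
        if not self.is_known(s_bar, a):
            return {s_bar: 1.0}  # assumed optimistic self-loop for unknown pairs
        key = (s_bar, a)
        n = self.visits[key]
        return {nxt: count / n for nxt, count in self.next_counts[key].items()}


# Ground states 0 and 1 both map to abstract state 0, so their samples are pooled.
model = AbstractRMaxModel(m=2, r_max=1.0)
model.update(s=0, a=0, r=0.0, s_next=2)
model.update(s=1, a=0, r=0.5, s_next=3)
print(model.is_known(0, 0), model.reward(0, 0), model.transition(0, 0))
```

A planner would then solve the optimistic abstract MDP defined by `reward` and `transition`. The accuracy of those pooled estimates is what the martingale argument is needed for, since a standard i.i.d. bound does not apply to them directly.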

