AI · Nov 5, 2021

Regular Decision Processes for Grid Worlds

arXiv:2111.03647v2
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for more flexible and verifiable decision-making models in reinforcement learning, though it appears incremental as it builds on existing regular decision process frameworks.

The paper tackles the problem of sequential decision making under uncertainty with non-Markovian dependencies by experimentally investigating regular decision processes, providing a tool chain, algorithmic extensions, and empirical evaluations in grid worlds.

Markov decision processes are typically used for sequential decision making under uncertainty. In many settings, however, ranging from constrained or safe specifications to various kinds of temporal (non-Markovian) dependencies in task and reward structures, extensions are needed. To that end, interest has grown in recent years in combining reinforcement learning with temporal logic, that is, combining flexible behavior learning methods with robust verification and guarantees. In this paper we describe an experimental investigation of the recently introduced regular decision processes, which support both non-Markovian reward functions and non-Markovian transition functions. In particular, we provide a tool chain for regular decision processes, algorithmic extensions relating to online, incremental learning, an empirical evaluation of model-free and model-based solution algorithms, and applications in regular, but non-Markovian, grid worlds.
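To make "regular, but non-Markovian" concrete, here is a minimal sketch of a grid world whose reward depends on the history of visited cells rather than on the current cell alone. It is an illustration only, not the paper's tool chain: the corridor layout, the key/door task, and every name (`dfa_step`, `move`, `reward`, `run`) are assumptions made for this example. The history dependence is regular because a two-state finite automaton is enough to track it.

```python
# Hypothetical 1-D corridor with cells 0..4: a key at cell 0, a door at cell 4.
KEY_CELL, DOOR_CELL, N_CELLS = 0, 4, 5

def dfa_step(q: int, cell: int) -> int:
    """Two-state automaton over visited cells: 0 = no key yet, 1 = key collected."""
    return 1 if (q == 1 or cell == KEY_CELL) else 0

def move(cell: int, action: int) -> int:
    """Deterministic move; action is -1 (left) or +1 (right), clipped to the corridor."""
    return max(0, min(N_CELLS - 1, cell + action))

def reward(q: int, cell: int) -> float:
    """Reward 1.0 only when the door is reached after the key was collected."""
    return 1.0 if (cell == DOOR_CELL and q == 1) else 0.0

def run(start: int, actions: list[int]) -> float:
    """Roll out a fixed action sequence and return the total reward."""
    cell, q, total = start, dfa_step(0, start), 0.0
    for a in actions:
        cell = move(cell, a)
        q = dfa_step(q, cell)      # automaton state summarizes the relevant history
        total += reward(q, cell)
    return total

# Both runs end at the door (cell 4), yet the rewards differ:
print(run(2, [1, 1]))                  # 0.0 (key never collected)
print(run(2, [-1, -1, 1, 1, 1, 1]))    # 1.0 (key collected first)
```

Viewed over cells alone, the reward is not Markovian, since the same final cell yields different rewards. Viewed over the pair (cell, automaton state), it is Markovian again, which is the kind of structure a finite-automaton description of history can expose.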

