AI · SY · Feb 23, 2023

Intermittently Observable Markov Decision Processes

arXiv:2302.11761v2 · 3 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This work addresses a practical problem in control systems where communication channels are unreliable, though it appears to be an incremental improvement on existing methods for handling partial observability.

This paper tackles the problem of finding optimal policies for Markov Decision Processes when state information is only intermittently available because it reaches the controller over an unreliable communication channel, with transmissions modeled as a Bernoulli lossy process. The authors develop two finite-state approximations and a nested value iteration algorithm that is proved to be faster than standard value iteration, and they report numerical results supporting the approach.
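For reference, the baseline that the nested algorithm is compared against can be sketched as follows. This is textbook value iteration on a fully observed finite MDP, not the paper's nested variant. The array layout (`P[a, s, s']`, `R[a, s]`) and the names `value_iteration`, `gamma`, and `tol` are assumptions made for illustration.

```python
import numpy as np

def value_iteration(P, R, gamma=0.95, tol=1e-8, max_iter=10_000):
    """Standard value iteration (the baseline, not the paper's nested variant).

    P[a, s, s2] : probability of moving from s to s2 under action a (assumed layout)
    R[a, s]     : expected one-step reward for taking action a in state s
    Returns the value function V and a greedy policy with respect to V.
    """
    V = np.zeros(P.shape[1])
    for _ in range(max_iter):
        # Bellman optimality backup: Q[a, s] = R[a, s] + gamma * sum_s2 P[a, s, s2] * V[s2]
        Q = R + gamma * (P @ V)
        V_new = Q.max(axis=0)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # Greedy policy extracted from the final value estimate.
    Q = R + gamma * (P @ V)
    return V, Q.argmax(axis=0)

# Toy example: action 0 stays put, action 1 switches state; state 1 pays reward 1.
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [1.0, 0.0]]])
R = np.array([[0.0, 1.0],
              [0.0, 1.0]])
V, policy = value_iteration(P, R, gamma=0.9)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1.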

This paper investigates MDPs with intermittent state information. We consider a scenario where the controller perceives the state information of the process via an unreliable communication channel. The transmissions of state information over the whole time horizon are modeled as a Bernoulli lossy process. Hence, the problem is finding an optimal policy for selecting actions in the presence of state information losses. We first formulate the problem as a belief MDP to establish structural results. The effect of state information losses on the expected total discounted reward is studied systematically. Then, we reformulate the problem as a tree MDP whose state space is organized in a tree structure. Two finite-state approximations to the tree MDP are developed to find near-optimal policies efficiently. Finally, we put forth a nested value iteration algorithm for the finite-state approximations, which is proved to be faster than standard value iteration. Numerical results demonstrate the effectiveness of our methods.
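The belief-MDP view in the abstract can be illustrated with a minimal sketch, under these assumptions: a finite state space, a known transition array `P[a, s, s']`, and a controller that either receives the current state exactly or receives nothing, with each transmission lost independently with probability `loss_prob`. The function names, the array layout, and the `policy` callback are hypothetical and are not taken from the paper.

```python
import numpy as np

def propagate_belief(belief, P_a, observed_state=None):
    """One belief update under intermittent observation.

    P_a[i, j] is the probability of moving from state i to state j
    under the action that was just applied (assumed layout).
    """
    if observed_state is not None:
        # Transmission succeeded: the belief collapses onto the received state.
        point = np.zeros_like(belief)
        point[observed_state] = 1.0
        return point
    # Transmission lost: push the previous belief through the dynamics.
    return belief @ P_a

def simulate(P, policy, loss_prob, s0, steps, rng):
    """Simulate the process with i.i.d. Bernoulli transmission losses.

    P[a, s, s2] is the transition array, and policy maps a belief vector
    to an action index (both are illustrative assumptions).
    """
    n_states = P.shape[1]
    state = s0
    belief = np.eye(n_states)[s0]  # the initial state is assumed known
    history = []
    for _ in range(steps):
        action = policy(belief)
        state = rng.choice(n_states, p=P[action, state])
        received = rng.random() >= loss_prob  # lost with probability loss_prob
        belief = propagate_belief(belief, P[action], state if received else None)
        history.append((state, received, belief.copy()))
    return history
```

Under these assumptions the belief at any time depends only on the last successfully received state and the actions applied since then. That is one way to read the tree-structured state space the abstract mentions, though the paper's exact construction may differ.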

