AI SY Feb 23, 2023

Intermittently Observable Markov Decision Processes

arXiv:2302.11761v26.73 citations h-index: 46

Originality: Incremental advance

AI Analysis

This work addresses a practical problem in control systems where communication channels are unreliable, though it appears to be an incremental improvement on existing methods for handling partial observability.

This paper tackles the problem of finding optimal policies for Markov Decision Processes when state information is intermittently available due to unreliable communication channels, modeled as a Bernoulli lossy process. The authors develop finite-state approximations and a nested value iteration algorithm that is shown to be faster than standard value iteration, with numerical results demonstrating effectiveness.

This paper investigates MDPs with intermittent state information. We consider a scenario where the controller perceives the state information of the process via an unreliable communication channel. The transmissions of state information over the whole time horizon are modeled as a Bernoulli lossy process. Hence, the problem is finding an optimal policy for selecting actions in the presence of state information losses. We first formulate the problem as a belief MDP to establish structural results. The effect of state information losses on the expected total discounted reward is studied systematically. Then, we reformulate the problem as a tree MDP whose state space is organized in a tree structure. Two finite-state approximations to the tree MDP are developed to find near-optimal policies efficiently. Finally, we put forth a nested value iteration algorithm for the finite-state approximations, which is proved to be faster than standard value iteration. Numerical results demonstrate the effectiveness of our methods.
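To make the setting concrete, the following is a minimal sketch of value iteration for an MDP whose state arrives over a Bernoulli lossy channel. It is not the paper's algorithm and not necessarily either of its two finite-state approximations. The function name `solve_truncated`, the array layout of `P` and `R`, the toy numbers, and the truncation rule (at a fixed depth, the next transmission is assumed to succeed) are all assumptions made for illustration. Each node of the tree is represented by the belief reached from the last observed state after a sequence of actions taken without a new observation.

```python
import numpy as np


def solve_truncated(P, R, gamma, p, depth, tol=1e-8, max_iter=10_000):
    """Illustrative value iteration under intermittent state observations.

    P: array (A, S, S), P[a, s, t] = Pr(next state t | state s, action a).
    R: array (S, A), expected one-step reward.
    gamma: discount factor in (0, 1).
    p: probability that a state transmission succeeds (Bernoulli channel).
    depth: number of consecutive losses tracked before truncating
           (hypothetical truncation rule, chosen for this sketch).
    Returns the value of each state immediately after it has been observed.
    """
    n_actions, n_states, _ = P.shape
    root_values = np.zeros(n_states)

    def backup(belief, k):
        # Assumption: at the truncation depth the next transmission succeeds.
        p_eff = 1.0 if k >= depth else p
        best = -np.inf
        for a in range(n_actions):
            predicted = belief @ P[a]  # belief after acting, before observing
            # Expected reward plus the branch where the state is received.
            q = belief @ R[:, a] + gamma * p_eff * (predicted @ root_values)
            if p_eff < 1.0:
                # Branch where the state is lost: go one level deeper.
                q += gamma * (1.0 - p_eff) * backup(predicted, k + 1)
            best = max(best, q)
        return best

    identity = np.eye(n_states)
    for _ in range(max_iter):
        new_values = np.array([backup(identity[s], 0) for s in range(n_states)])
        if np.max(np.abs(new_values - root_values)) < tol:
            return new_values
        root_values = new_values
    return root_values


# Toy two-state, two-action problem (made-up numbers).
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
R = np.array([[1.0, 0.0],
              [0.0, 2.0]])
gamma = 0.9

v_full = solve_truncated(P, R, gamma, p=1.0, depth=3)   # lossless channel
v_lossy = solve_truncated(P, R, gamma, p=0.5, depth=3)  # half the updates lost
print("lossless:", v_full)
print("lossy:   ", v_lossy)
```

With `p=1.0` the recursion never descends and the sketch reduces to ordinary value iteration, which gives a simple sanity check. The cost of each sweep grows exponentially with `depth`, so this form is only practical for small problems.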
