AISYOCJul 11, 2012

Discretized Approximations for POMDP with Average Cost

arXiv:1207.4154v159 citations
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficiently solving average-cost POMDPs, which is important for applications in robotics and control, but it is incremental as it extends existing discounted-cost methods to the average-cost case.

The authors tackled the problem of approximating partially observable Markov decision processes (POMDPs) with average cost criteria by proposing a new lower approximation scheme that uses discretized belief points and efficient value iteration algorithms. They showed that this scheme provides lower bounds on optimal average costs and can compute upper bounds, with convergence proven under specific conditions like constant optimal average cost and continuous differential cost.

In this paper, we propose a new lower approximation scheme for POMDP with discounted and average cost criterion. The approximating functions are determined by their values at a finite number of belief points, and can be computed efficiently using value iteration algorithms for finite-state MDP. While for discounted problems several lower approximation schemes have been proposed earlier, ours seems the first of its kind for average cost problems. We focus primarily on the average cost case, and we show that the corresponding approximation can be computed efficiently using multi-chain algorithms for finite-state MDP. We give a preliminary analysis showing that regardless of the existence of the optimal average cost J in the POMDP, the approximation obtained is a lower bound of the liminf optimal average cost function, and can also be used to calculate an upper bound on the limsup optimal average cost function, as well as bounds on the cost of executing the stationary policy associated with the approximation. Weshow the convergence of the cost approximation, when the optimal average cost is constant and the optimal differential cost is continuous.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes