AI · Apr 27, 2018

Expectation Optimization with Probabilistic Guarantees in POMDPs with Discounted-sum Objectives

arXiv:1804.10601v2 · 6 citations
Originality: Incremental advance
AI Analysis

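This work addresses risk-averse decision-making under uncertainty, with applications in fields such as robotics and finance, though it appears incremental in that it combines two existing measures: the expectation of the payoff and the probability of exceeding a payoff threshold.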

The paper tackles the problem of optimizing expected payoff in POMDPs while ensuring a minimum probability of meeting a payoff threshold, which addresses the limitations of purely expectation-based and purely probability-based approaches. It presents the first algorithm to solve this problem, called expectation optimization with probabilistic guarantee (EOPG).

Partially-observable Markov decision processes (POMDPs) with discounted-sum payoff are a standard framework to model a wide range of problems related to decision making under uncertainty. Traditionally, the goal has been to obtain policies that optimize the expectation of the discounted-sum payoff. A key drawback of the expectation measure is that even low-probability events with extreme payoff can significantly affect the expectation, and thus the obtained policies are not necessarily risk-averse. An alternative approach is to optimize the probability that the payoff is above a certain threshold, which allows obtaining risk-averse policies, but ignores optimization of the expectation. We consider the expectation optimization with probabilistic guarantee (EOPG) problem, where the goal is to optimize the expectation while ensuring that the payoff is above a given threshold with at least a specified probability. We present several results on the EOPG problem, including the first algorithm to solve it.
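To make the three objectives in the abstract concrete, here is a minimal Python sketch. It is not the paper's algorithm: it assumes the payoff distribution of each candidate policy is already known and that there are only finitely many candidates, whereas the actual EOPG problem optimizes over all policies of a POMDP. The policy names, payoff values, and function names are hypothetical, and reading "above the threshold" as "greater than or equal to" is also an assumption.

```python
from typing import Dict, List, Optional, Tuple

# A payoff distribution: (probability, discounted-sum payoff) pairs.
Distribution = List[Tuple[float, float]]


def discounted_sum(rewards: List[float], gamma: float) -> float:
    """Discounted-sum payoff of one run: sum over t of gamma**t * rewards[t]."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


def expectation(dist: Distribution) -> float:
    """Expected payoff under the given distribution."""
    return sum(p * x for p, x in dist)


def prob_at_least(dist: Distribution, threshold: float) -> float:
    """Probability that the payoff reaches the threshold (assumed: >=)."""
    return sum(p for p, x in dist if x >= threshold)


def best_under_guarantee(
    policies: Dict[str, Distribution], threshold: float, alpha: float
) -> Optional[str]:
    """EOPG-style selection over a finite candidate set: among policies whose
    payoff reaches `threshold` with probability at least `alpha`, return the
    one with the highest expectation, or None if no candidate qualifies."""
    feasible = {
        name: dist
        for name, dist in policies.items()
        if prob_at_least(dist, threshold) >= alpha
    }
    if not feasible:
        return None
    return max(feasible, key=lambda name: expectation(feasible[name]))


# Hypothetical payoff distributions for three candidate policies.
policies: Dict[str, Distribution] = {
    "risky": [(0.1, 100.0), (0.9, 0.0)],               # high mean, rarely pays
    "safe": [(1.0, 6.0)],                               # always pays a little
    "mixed": [(0.9, 6.0), (0.05, 60.0), (0.05, 0.0)],  # mostly safe
}

by_expectation = max(policies, key=lambda n: expectation(policies[n]))
by_probability = max(policies, key=lambda n: prob_at_least(policies[n], 5.0))
by_eopg = best_under_guarantee(policies, threshold=5.0, alpha=0.9)
```

With these made-up numbers, pure expectation picks "risky", pure threshold probability picks "safe", and the combined criterion picks "mixed": it gives up some expected payoff relative to "risky" in exchange for meeting the threshold with probability at least 0.9.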

