AI · RO · SY · OC · Dec 4, 2020

Constrained Risk-Averse Markov Decision Processes

arXiv:2012.02423v2 · 31 citations
AI Analysis

This work provides a method for designing risk-averse policies in MDPs, which is important for applications where worst-case scenarios need to be mitigated, such as in autonomous systems or financial planning.

This paper addresses policy design for Markov Decision Processes (MDPs) with dynamic coherent risk objectives and constraints. The authors formulate the problem using a Lagrangian framework and propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. The method is demonstrated on a rover navigation problem involving CVaR and EVaR risk measures.
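For readers unfamiliar with the two risk measures named above, the following is a minimal sketch of how CVaR and EVaR can be evaluated for a finite cost distribution. It assumes the common variational forms CVaR_eps(c) = inf_z { z + E[(c - z)+] / eps } and EVaR_eps(c) = inf_{z>0} { log(E[exp(z c)] / eps) / z }, where eps in (0, 1] is the tail probability and eps = 1 recovers the expectation. The function names, the log-spaced search grid, and the example numbers are illustrative assumptions, not taken from the paper.

```python
import math


def cvar(costs, probs, eps):
    """CVaR at tail probability eps for a finite cost distribution.

    The objective z + E[(c - z)+] / eps is convex and piecewise linear in z,
    so its minimum is attained at one of the cost atoms.
    """
    return min(
        z + sum(p * max(c - z, 0.0) for c, p in zip(costs, probs)) / eps
        for z in costs
    )


def evar(costs, probs, eps, grid=2000):
    """Approximate EVaR by a search over a log-spaced grid of z > 0.

    Because the grid is a subset of (0, inf), the result is an upper
    estimate of the true infimum. The grid range is an arbitrary choice.
    """
    atoms = [(c, p) for c, p in zip(costs, probs) if p > 0.0]
    best = math.inf
    for k in range(grid):
        z = 10.0 ** (-4.0 + 8.0 * k / (grid - 1))
        # Stable log E[exp(z c)] via the log-sum-exp trick.
        m = max(z * c for c, _ in atoms)
        log_mgf = m + math.log(sum(p * math.exp(z * c - m) for c, p in atoms))
        best = min(best, (log_mgf - math.log(eps)) / z)
    return best


# Hypothetical traversal cost: usually cheap, occasionally very expensive.
costs = [0.0, 1.0, 10.0]
probs = [0.7, 0.2, 0.1]
mean = sum(c * p for c, p in zip(costs, probs))
cvar_03 = cvar(costs, probs, 0.3)
evar_03 = evar(costs, probs, 0.3)
```

For this distribution the ordering mean <= CVaR <= EVaR <= maximum cost holds, which reflects that EVaR is the more conservative of the two measures at the same eps.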

We consider the problem of designing policies for Markov decision processes (MDPs) with dynamic coherent risk objectives and constraints. We begin by formulating the problem in a Lagrangian framework. Under the assumption that the risk objectives and constraints can be represented by a Markov risk transition mapping, we propose an optimization-based method to synthesize Markovian policies that lower-bound the constrained risk-averse problem. We demonstrate that the formulated optimization problems are in the form of difference convex programs (DCPs) and can be solved by the disciplined convex-concave programming (DCCP) framework. We show that these results generalize linear programs for constrained MDPs with total discounted expected costs and constraints. Finally, we illustrate the effectiveness of the proposed method with numerical experiments on a rover navigation problem involving conditional-value-at-risk (CVaR) and entropic-value-at-risk (EVaR) coherent risk measures.
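To make the idea of a dynamic risk objective built from a Markov risk transition mapping more concrete, here is a small sketch of a nested-CVaR Bellman recursion, V(s) = min_a [ c(s, a) + gamma * CVaR_eps(V(s') | s, a) ], on a three-state toy problem. This is an unconstrained value iteration used purely for illustration; it is not the DCCP-based synthesis method for the constrained problem described in the abstract. The MDP, its costs, and all function names are hypothetical.

```python
def cvar(values, probs, eps):
    """CVaR at tail probability eps; the minimizing z is one of the atoms."""
    return min(
        z + sum(p * max(v - z, 0.0) for v, p in zip(values, probs)) / eps
        for z in values
    )


def risk_averse_value_iteration(P, c, gamma, eps, iters=1000, tol=1e-10):
    """Nested-CVaR value iteration.

    P[s][a] is a list of next-state probabilities, c[s][a] is the stage cost.
    With eps = 1 the recursion reduces to ordinary expected-cost value iteration.
    """
    n = len(P)
    V = [0.0] * n

    def q(s, a, V):
        return c[s][a] + gamma * cvar(V, P[s][a], eps)

    for _ in range(iters):
        V_new = [min(q(s, a, V) for a in range(len(P[s]))) for s in range(n)]
        done = max(abs(x - y) for x, y in zip(V_new, V)) < tol
        V = V_new
        if done:
            break
    policy = [min(range(len(P[s])), key=lambda a: q(s, a, V)) for s in range(n)]
    return V, policy


# States: 0 = start, 1 = goal (absorbing, free), 2 = crash (absorbing, costly).
# At the start, action 0 is a cheap shortcut that crashes with probability 0.1,
# and action 1 is a more expensive but certain route to the goal.
P = [
    [[0.0, 0.9, 0.1], [0.0, 1.0, 0.0]],
    [[0.0, 1.0, 0.0]],
    [[0.0, 0.0, 1.0]],
]
c = [[1.0, 3.0], [0.0], [2.0]]

V_neutral, pi_neutral = risk_averse_value_iteration(P, c, gamma=0.9, eps=1.0)
V_averse, pi_averse = risk_averse_value_iteration(P, c, gamma=0.9, eps=0.1)
```

In this toy problem the risk-neutral recursion (eps = 1) prefers the shortcut at the start state, while the recursion with eps = 0.1 switches to the certain route, since the nested CVaR weights the crash outcome far more heavily than its probability alone would.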

