Risk-Averse Planning Under Uncertainty
This work addresses risk management in uncertain environments, with applications such as robotics and finance. It is incremental, building on existing POMDP and risk-averse planning methods.
The paper tackles the design of risk-averse policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Because optimal risk-averse policies require infinite memory, the synthesis problem is undecidable. The paper therefore proposes a method based on bounded policy iteration that uses finite state controllers to produce sub-optimal solutions with lower coherent risk.
We consider the problem of designing policies for partially observable Markov decision processes (POMDPs) with dynamic coherent risk objectives. Synthesizing risk-averse optimal policies for POMDPs requires infinite memory and is thus undecidable. To overcome this difficulty, we propose a method based on bounded policy iteration for designing stochastic but finite state (memory) controllers, which takes advantage of standard convex optimization methods. Given a memory budget and an optimality criterion, the proposed method modifies the stochastic finite state controller, leading to sub-optimal solutions with lower coherent risk.
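To make the notion of a coherent risk objective concrete, the following minimal sketch computes conditional value-at-risk (CVaR), a standard example of a coherent risk measure, for a discrete cost distribution. The choice of CVaR, the function name `cvar`, and the convention that `alpha` is the tail probability are assumptions made for illustration. The abstract does not name a specific risk measure, and this is not the paper's algorithm.

```python
def cvar(costs, probs, alpha):
    """CVaR of a discrete cost distribution (illustrative sketch).

    Assumed convention: alpha in (0, 1] is the probability mass of the
    worst-case tail, so alpha = 1 recovers the plain expectation.
    Uses the Rockafellar-Uryasev form:
        CVaR_alpha(X) = min_z  z + E[max(X - z, 0)] / alpha
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must lie in (0, 1]")
    if len(costs) != len(probs):
        raise ValueError("costs and probs must have the same length")
    if abs(sum(probs) - 1.0) > 1e-9:
        raise ValueError("probs must sum to 1")

    def objective(z):
        # Expected cost in excess of the threshold z, scaled by 1 / alpha.
        excess = sum(p * max(c - z, 0.0) for c, p in zip(costs, probs))
        return z + excess / alpha

    # The objective is convex and piecewise linear in z with breakpoints
    # at the support points, so checking those points is sufficient.
    return min(objective(z) for z in costs)


if __name__ == "__main__":
    costs = [0.0, 10.0, 100.0]
    probs = [0.7, 0.2, 0.1]
    print("expected cost:", cvar(costs, probs, 1.0))
    print("CVaR, worst 20%:", cvar(costs, probs, 0.2))
```

For this example distribution the expected cost is 12, while the CVaR over the worst 20% of outcomes is approximately 55. A risk-neutral objective would ignore this gap; a risk-averse one penalizes it.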