Enabling risk-aware Reinforcement Learning for medical interventions through uncertainty decomposition
This paper addresses the problem of risk-aware RL deployment for medical interventions, offering an incremental improvement by adapting distributional RL to decompose uncertainty.
The paper tackles the challenge of deploying Reinforcement Learning (RL) in high-risk environments like healthcare by decomposing aleatoric and epistemic uncertainties, which are typically confounded in standard RL, and demonstrates this method in grid world examples and a clinical decision support system proof of concept.
Reinforcement Learning (RL) is emerging as a tool for tackling complex control and decision-making problems. However, in high-risk environments such as healthcare, manufacturing, automotive or aerospace, it is often challenging to bridge the gap between an apparently optimal policy learnt by an agent and its real-world deployment, due to the associated uncertainties and risk. Broadly speaking, RL agents face two kinds of uncertainty: (1) aleatoric uncertainty, which reflects randomness or noise in the dynamics of the world, and (2) epistemic uncertainty, which reflects the bounded knowledge of the agent due to model limitations and the finite amount of information/data the agent has acquired about the world. These two types of uncertainty carry fundamentally different implications for the evaluation of performance and the level of risk or trust. Yet aleatoric and epistemic uncertainties are generally confounded, as standard and even distributional RL are agnostic to this difference. Here we propose how a distributional approach (UA-DQN) can be recast to render these uncertainties separately by decomposing the net effect of each. We demonstrate the operation of this method in grid world examples to build intuition and then show a proof-of-concept application for an RL agent operating as a clinical decision support system in critical care.
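
To make the distinction between the two kinds of uncertainty concrete, the following is a minimal sketch of one common way to separate them, and not necessarily the decomposition used in the paper. It assumes an ensemble of return-distribution estimates for a single state-action pair, each given as a set of quantile values, and applies the law of total variance: the spread within each member's distribution is treated as aleatoric, and the disagreement between members' means as epistemic. The function name `decompose_uncertainty` and the input layout are hypothetical choices made for illustration.

```python
import numpy as np


def decompose_uncertainty(quantiles):
    """Split return variance into aleatoric and epistemic parts.

    Assumption (illustrative only): `quantiles` has shape
    (n_members, n_quantiles), where each row holds one ensemble member's
    quantile estimates of the return for the same state-action pair.
    """
    q = np.asarray(quantiles, dtype=float)

    # Spread inside each member's predicted distribution (noise in the world),
    # averaged over members.
    aleatoric = q.var(axis=1).mean()

    # Disagreement between members about the expected return (limited knowledge).
    epistemic = q.mean(axis=1).var()

    return aleatoric, epistemic


# Members agree on a noisy outcome: all uncertainty is aleatoric.
noisy = decompose_uncertainty([[0.0, 10.0], [0.0, 10.0]])

# Members disagree about a deterministic outcome: all uncertainty is epistemic.
unsure = decompose_uncertainty([[2.0, 2.0], [8.0, 8.0]])

print(noisy)   # aleatoric 25.0, epistemic 0.0
print(unsure)  # aleatoric 0.0, epistemic 9.0
```

In this sketch, collecting more data would be expected to shrink only the epistemic term, whereas the aleatoric term reflects risk that remains even for a well-informed agent; this is why the two carry different implications for trust in a deployed policy.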