LG AI | Apr 6, 2023

Decision-Focused Model-based Reinforcement Learning for Reward Transfer

Abhishek Sharma, Sonali Parbhoo, Omer Gottesman, Finale Doshi-Velez

arXiv:2304.03365v32.01 citations | h-index: 56

Originality: Incremental advance

AI Analysis

The work addresses the need for simple, interpretable models in critical domains such as healthcare, though the advance is incremental because it builds on existing MBRL methods.

The paper tackles a weakness of model-based reinforcement learning (MBRL): standard algorithms are either sensitive to changes in the reward function or perform suboptimally when the transition model is restricted. It proposes a robust decision-focused (RDF) algorithm that learns transition models that achieve high returns while remaining robust to reward changes, and demonstrates it on simulators and real patient data.

Model-based reinforcement learning (MBRL) provides a way to learn a transition model of the environment, which can then be used to plan personalized policies for different patient cohorts and to understand the dynamics involved in the decision-making process. However, standard MBRL algorithms are either sensitive to changes in the reward function or achieve suboptimal performance on the task when the transition model is restricted. Motivated by the need to use simple and interpretable models in critical domains such as healthcare, we propose a novel robust decision-focused (RDF) algorithm that learns a transition model that achieves high returns while being robust to changes in the reward function. We demonstrate our RDF algorithm can be used with several model classes and planning algorithms. We also provide theoretical and empirical evidence, on a variety of simulators and real patient data, that RDF can learn simple yet effective models that can be used to plan personalized policies.
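
To make the setting concrete, here is a minimal sketch of generic tabular MBRL with reward transfer: a transition model is estimated once from data and then reused to plan under two different reward functions. This is not the RDF algorithm from the paper. The function names (`estimate_transitions`, `plan`), the two-state toy environment, and the reward tables are all assumptions made for illustration.

```python
import numpy as np


def estimate_transitions(transitions, n_states, n_actions):
    """Maximum-likelihood tabular model T[s, a, s'] from (s, a, s') tuples."""
    counts = np.zeros((n_states, n_actions, n_states))
    for s, a, s_next in transitions:
        counts[s, a, s_next] += 1.0
    totals = counts.sum(axis=2, keepdims=True)
    # Unvisited (s, a) pairs fall back to a uniform next-state distribution.
    return np.where(totals > 0, counts / np.maximum(totals, 1.0), 1.0 / n_states)


def plan(T, reward, gamma=0.9, n_iters=500):
    """Value iteration on the learned model; reward has shape (n_states, n_actions)."""
    v = np.zeros(T.shape[0])
    for _ in range(n_iters):
        q = reward + gamma * (T @ v)  # expected return of each (state, action)
        v = q.max(axis=1)
    return q.argmax(axis=1)  # greedy action per state


# Hypothetical toy data: action 0 stays in place, action 1 switches state.
data = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]
T_hat = estimate_transitions(data, n_states=2, n_actions=2)

# Two illustrative reward functions sharing the same dynamics.
reward_a = np.array([[0.0, 0.0], [1.0, 1.0]])  # acting in state 1 is rewarded
reward_b = np.array([[1.0, 1.0], [0.0, 0.0]])  # acting in state 0 is rewarded

policy_a = plan(T_hat, reward_a)
policy_b = plan(T_hat, reward_b)
```

The same learned model `T_hat` yields a different policy for each reward table. The abstract's concern is what happens when the model class is too restricted to fit the dynamics exactly, a case this toy example does not exercise.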
