Censoring-Aware Tree-Based Reinforcement Learning for Estimating Dynamic Treatment Regimes with Censored Outcomes
This work addresses the challenge of estimating personalized treatment strategies in healthcare, particularly for sequential decisions with censored data, and represents an incremental advance in data-driven clinical decision-making.
The paper tackles the problem of estimating optimal dynamic treatment regimes with censored survival outcomes by proposing CA-TRL, a framework that enhances tree-based reinforcement learning with augmented inverse probability weighting (AIPW) and censoring-aware modifications. In simulations and on real-world epilepsy data, CA-TRL outperforms the ASCL method, improving restricted mean survival time and decision-making accuracy.
Dynamic Treatment Regimes (DTRs) provide a systematic approach for making sequential treatment decisions that adapt to individual patient characteristics, particularly in clinical contexts where survival outcomes are of interest. We propose Censoring-Aware Tree-Based Reinforcement Learning (CA-TRL), a novel framework that addresses the complexities of censored data when estimating optimal DTRs from observational data. By enhancing traditional tree-based reinforcement learning methods with augmented inverse probability weighting (AIPW) and censoring-aware modifications, CA-TRL delivers robust and interpretable treatment strategies. We demonstrate its effectiveness through extensive simulations and a real-world application to the SANAD epilepsy dataset, where it outperformed the recently proposed ASCL method on key metrics such as restricted mean survival time (RMST) and decision-making accuracy. This work represents a step forward in advancing personalized and data-driven treatment strategies across diverse healthcare settings.
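To make the RMST metric concrete, the following is a minimal sketch of how a restricted mean survival time can be computed from right-censored data as the area under a Kaplan-Meier curve up to a horizon tau. It is a standard textbook estimator, not the CA-TRL procedure itself. The function name `km_rmst`, its arguments, and the toy data are assumptions introduced here for illustration.

```python
import numpy as np

def km_rmst(time, event, tau):
    """Area under the Kaplan-Meier survival curve on [0, tau].

    time  : observed follow-up times (event or censoring time)
    event : 1 if the event was observed, 0 if the observation was censored
    tau   : restriction horizon (hypothetical choice made by the analyst)
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)

    # Distinct observed event times that fall inside the horizon.
    event_times = np.unique(time[(event == 1) & (time <= tau)])

    surv = 1.0    # current Kaplan-Meier survival estimate
    rmst = 0.0    # accumulated area under the step function
    prev_t = 0.0
    for t in event_times:
        # The curve is flat at `surv` between prev_t and t.
        rmst += surv * (t - prev_t)
        at_risk = np.sum(time >= t)                 # still under observation at t
        deaths = np.sum((time == t) & (event == 1))
        surv *= 1.0 - deaths / at_risk              # Kaplan-Meier product-limit update
        prev_t = t

    # Remaining flat segment from the last event time up to tau.
    rmst += surv * (tau - prev_t)
    return rmst

# Toy example (made-up data): the second subject is censored at t = 3.
toy_rmst = km_rmst(time=[2.0, 3.0, 5.0], event=[1, 0, 1], tau=4.0)
```

A comparison between two regimes could then be summarized as the difference in RMST between the groups following each regime. With observational data, such a comparison would additionally need weighting or augmentation of the kind the abstract describes, which this sketch does not include.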