LG · Mar 25, 2022

A Conservative Q-Learning approach for handling distribution shift in sepsis treatment strategies

Pramod Kaushik, Sneha Kummetha, Perusha Moodley, Raju S. Bapi

arXiv:2203.13884v14.615 citations · h-index: 23

Originality Synthesis-oriented

AI Analysis

This work addresses the challenge of optimizing sepsis treatment strategies for clinicians in Intensive Care Units. Its contribution is incremental: it applies the existing Conservative Q-Learning (CQL) method to a healthcare domain.

The paper tackles distribution shift in offline reinforcement learning for sepsis treatment by using a Conservative Q-Learning (CQL) algorithm. The resulting policy matches physician actions more closely than one learned with conventional deep Q-learning.

Sepsis is a leading cause of mortality, and its treatment is very expensive. Treatment is also challenging because there is no consensus on which interventions work best, and different patients respond very differently to the same treatment. Deep reinforcement learning methods can be used to derive optimal treatment policies that mirror physician actions. In healthcare, the available data is mostly collected offline, with no interaction with the environment, which necessitates offline RL techniques. The offline RL paradigm suffers from action distribution shift, which in turn negatively affects learning an optimal treatment policy. In this work, a Conservative Q-Learning (CQL) algorithm is used to mitigate this shift, and the resulting policy comes closer to the physicians' policy than conventional deep Q-learning does. The learned policy could help clinicians in Intensive Care Units make better decisions when treating septic patients and improve survival rates.
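The abstract does not spell out the loss being optimized, so the following is only a minimal sketch of how a conservative penalty is commonly added to a Q-learning objective for discrete actions. It assumes the widely used form in which the penalty for a state is the log-sum-exp of its Q-values minus the Q-value of the action actually recorded in the data. The function names, the `alpha` weight, the tuple layout of `batch`, and the use of plain Python lists in place of a neural network are all illustrative assumptions, not details taken from the paper.

```python
import math

def logsumexp(values):
    # Numerically stable log(sum(exp(v))) over a list of floats.
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def cql_penalty(q_values, action):
    # Assumed conservative term for one state:
    # logsumexp over all actions minus Q of the logged (physician) action.
    # It is non-negative and shrinks as the logged action dominates.
    return logsumexp(q_values) - q_values[action]

def td_error_squared(q_values, action, reward, next_q_values, done, gamma=0.99):
    # Standard one-step Q-learning target; no bootstrap on terminal steps.
    target = reward if done else reward + gamma * max(next_q_values)
    return (q_values[action] - target) ** 2

def cql_loss(batch, alpha=1.0, gamma=0.99):
    # batch: list of (q_values, action, reward, next_q_values, done) tuples
    # (hypothetical layout). alpha weights the conservative penalty;
    # alpha=0 recovers the plain Q-learning loss.
    total = 0.0
    for q, a, r, q_next, done in batch:
        td = 0.5 * td_error_squared(q, a, r, q_next, done, gamma)
        total += td + alpha * cql_penalty(q, a)
    return total / len(batch)
```

Under these assumptions, raising `alpha` pushes down the Q-values of actions that never appear in the logged data relative to the logged ones, which is the mechanism by which a learned policy is kept closer to observed physician behaviour.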
