End-to-end Offline Reinforcement Learning for Glycemia Control
This work addresses the risk of simulator overfitting in diabetes management, offering a more reliable method for glycemia control in real-world scenarios.
The paper tackles overfitting to simulators in closed-loop glycemia control for type I diabetes by proposing offline reinforcement learning agents trained on real patient data, achieving improved performance and adaptability without needing a simulator.
The development of closed-loop systems for glycemia control in type I diabetes relies heavily on simulated patients. Improving the performance and adaptability of these closed-loop systems raises the risk of overfitting to the simulator. This may have dire consequences, especially in unusual cases that were not faithfully captured by the simulator, if they were captured at all. To address this, we propose using offline RL agents, trained on real patient data, to perform glycemia control. To further improve performance, we propose an end-to-end personalization pipeline, which leverages offline policy evaluation methods to remove the need for a simulator altogether, while still enabling the estimation of clinically relevant metrics for diabetes.
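To make the idea of estimating a clinical metric without a simulator more concrete, the sketch below applies weighted importance sampling, one common off-policy evaluation estimator, to logged glucose trajectories. It is an illustration under stated assumptions and not the pipeline used in this work: the abstract does not say which estimator is used, and the step layout, the function names `time_in_range` and `wis_time_in_range`, and the 70-180 mg/dL target range are all choices made here for the example.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical logged step: (glucose in mg/dL, discrete action,
# probability the logging/behavior policy assigned to that action).
Step = Tuple[float, int, float]


def time_in_range(traj: Sequence[Step], low: float = 70.0, high: float = 180.0) -> float:
    """Fraction of steps whose glucose lies in [low, high] (assumed target range)."""
    if not traj:
        raise ValueError("trajectory must contain at least one step")
    in_range = sum(1 for glucose, _, _ in traj if low <= glucose <= high)
    return in_range / len(traj)


def wis_time_in_range(
    trajectories: List[Sequence[Step]],
    target_prob: Callable[[float, int], float],
) -> float:
    """Weighted importance sampling estimate of time-in-range under a target policy.

    target_prob(glucose, action) returns the probability that the evaluated
    policy would take `action` when observing `glucose`.
    """
    weights = []
    metrics = []
    for traj in trajectories:
        # Trajectory weight: product of per-step likelihood ratios.
        weight = 1.0
        for glucose, action, behavior_prob in traj:
            weight *= target_prob(glucose, action) / behavior_prob
        weights.append(weight)
        metrics.append(time_in_range(traj))

    total = sum(weights)
    if total == 0.0:
        raise ValueError("target policy has no support on the logged data")
    # Normalizing by the sum of weights trades a small bias for lower variance.
    return sum(w * m for w, m in zip(weights, metrics)) / total


# Toy logged data: two short trajectories collected by a uniform behavior policy.
logged = [
    [(100.0, 1, 0.5), (120.0, 1, 0.5)],  # always in range
    [(200.0, 0, 0.5), (150.0, 0, 0.5)],  # in range half of the time
]


def prefers_action_one(glucose: float, action: int) -> float:
    """Hypothetical target policy that picks action 1 with probability 0.8."""
    return 0.8 if action == 1 else 0.2


estimate = wis_time_in_range(logged, prefers_action_one)
```

In this toy data the target policy favors the action taken in the first trajectory, so the estimate moves toward that trajectory's time-in-range. With real patient data the weights can become very unbalanced over long trajectories, which is one reason other estimators are often preferred in practice.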