Online Learning to Estimate Warfarin Dose with Contextual Linear Bandits
This work addresses the problem of optimizing Warfarin dosing for patients to reduce trial-and-error and adverse effects, though it is incremental as it builds on existing bandit methods in a specific medical domain.
The paper tackled the challenge of predicting the correct initial dose of Warfarin, a widely used anticoagulant, by developing and evaluating linear bandit algorithms on real data from PharmGKB, where all proposed algorithms outperformed the fixed-dose baseline and some matched the Warfarin Clinical Dosing Algorithm.
Warfarin is one of the most commonly used oral anticoagulant agents in the world. The proper dose of Warfarin is difficult to establish, not only because it varies substantially among patients, but also because an incorrect dose can have adverse, even severe, consequences. Typical practice is to prescribe an initial dose, after which the doctor closely monitors the patient's response and adjusts toward the correct dosage. The three commonly used strategies for choosing an initial dosage are the fixed-dose approach, the Warfarin Clinical Dosing Algorithm, and the Pharmacogenetic algorithm developed by the IWPC (International Warfarin Pharmacogenetics Consortium). Because prescribing the correct initial dosage is always preferable, this work explores how well multi-armed bandit algorithms can predict the correct dosage of Warfarin, in place of a trial-and-error procedure. Using real data from the Pharmacogenetics and Pharmacogenomics Knowledge Base (PharmGKB), a series of linear bandit algorithms and variants are developed and evaluated on the Warfarin dataset. All proposed algorithms outperformed the fixed-dose baseline, and some matched the Warfarin Clinical Dosing Algorithm. In addition, a few promising directions for future exploration and development are given.
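To make the idea of a contextual linear bandit concrete, the following is a minimal sketch of disjoint LinUCB, one common algorithm of this family. The abstract does not say which algorithms the paper uses, so this is an illustration, not the authors' method. Several details are assumptions made only for the example: the three dose buckets (low, medium, high) as arms, the reward of 0 for a correct bucket and -1 otherwise, the exploration parameter `alpha`, and the synthetic patient features, which stand in for the PharmGKB data and are not drawn from it.

```python
import math
import random


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


class LinUCB:
    """Disjoint LinUCB: one ridge-regression model per arm (illustrative sketch)."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha  # exploration strength (assumed value, not from the paper)
        self.dim = dim
        # A_inv[arm] is the inverse of (I + sum of x x^T); it starts as the identity.
        self.A_inv = [
            [[1.0 if i == j else 0.0 for j in range(dim)] for i in range(dim)]
            for _ in range(n_arms)
        ]
        self.b = [[0.0] * dim for _ in range(n_arms)]

    def select(self, x):
        """Return the arm with the highest upper confidence bound for context x."""
        best_arm, best_score = 0, -math.inf
        for arm in range(len(self.b)):
            A_inv_x = [dot(row, x) for row in self.A_inv[arm]]
            theta = [dot(row, self.b[arm]) for row in self.A_inv[arm]]
            bonus = self.alpha * math.sqrt(max(dot(x, A_inv_x), 0.0))
            score = dot(theta, x) + bonus
            if score > best_score:
                best_arm, best_score = arm, score
        return best_arm

    def update(self, arm, x, reward):
        """Sherman-Morrison rank-one update of A_inv, plus the update of b."""
        A_inv = self.A_inv[arm]
        u = [dot(row, x) for row in A_inv]  # A_inv is symmetric, so u = A_inv x
        denom = 1.0 + dot(x, u)
        for i in range(self.dim):
            for j in range(self.dim):
                A_inv[i][j] -= u[i] * u[j] / denom
            self.b[arm][i] += reward * x[i]


def simulate(n_rounds=2000, seed=0):
    """Run LinUCB on synthetic patients; return the fraction of correct buckets."""
    rng = random.Random(seed)
    bandit = LinUCB(n_arms=3, dim=3, alpha=1.0)
    correct = 0
    for _ in range(n_rounds):
        # Hypothetical context: a bias term plus two scaled features.
        x = [1.0, rng.uniform(-1, 1), rng.uniform(-1, 1)]
        # Hypothetical ground truth: the bucket depends on a hidden linear score.
        hidden = 0.8 * x[1] - 0.6 * x[2]
        true_arm = 0 if hidden < -0.3 else (2 if hidden > 0.3 else 1)
        arm = bandit.select(x)
        # Assumed reward: 0 for the correct dose bucket, -1 otherwise.
        bandit.update(arm, x, 0.0 if arm == true_arm else -1.0)
        correct += arm == true_arm
    return correct / n_rounds


print(f"fraction of correct dose buckets: {simulate():.3f}")
```

The online setting is what distinguishes this from ordinary supervised learning: the learner sees the reward only for the dose bucket it actually chose, so it has to balance exploring uncertain arms against exploiting the arm that currently looks best. The confidence bonus in `select` is one standard way to make that trade-off.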