LG · Oct 18, 2024

HR-Bandit: Human-AI Collaborated Linear Recourse Bandit

Junyu Cao, Ruijiang Gao, Esmaeil Keyvanshokooh

arXiv:2410.14640v2 · 11.54 citations · h-index: 10 · AISTATS

Originality: Incremental advance

AI Analysis

This work addresses the need for efficient human-AI collaboration in domains such as healthcare, where actionable recourses are critical. The contribution appears incremental: it extends existing bandit methods by integrating human input.

The paper tackles the problem of optimizing actionable recourses in human-AI collaboration settings such as healthcare. It proposes the HR-Bandit algorithm, which integrates human expertise and carries a warm-start guarantee, a human-effort guarantee, and sublinear regret even when human decisions are suboptimal. The method is validated empirically against existing benchmarks.

Human doctors frequently recommend actionable recourses that allow patients to modify their conditions to access more effective treatments. Inspired by such healthcare scenarios, we propose the Recourse Linear UCB ($\textsf{RLinUCB}$) algorithm, which optimizes both action selection and feature modifications by balancing exploration and exploitation. We further extend this to the Human-AI Linear Recourse Bandit ($\textsf{HR-Bandit}$), which integrates human expertise to enhance performance. $\textsf{HR-Bandit}$ offers three key guarantees: (i) a warm-start guarantee for improved initial performance, (ii) a human-effort guarantee to minimize required human interactions, and (iii) a robustness guarantee that ensures sublinear regret even when human decisions are suboptimal. Empirical results, including a healthcare case study, validate its superior performance against existing benchmarks.
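The idea of jointly choosing an action and a feature modification under an upper-confidence-bound rule can be illustrated with a minimal sketch. This is not the authors' RLinUCB or HR-Bandit: it is a plain per-arm LinUCB scored over a finite list of candidate modified feature vectors, and it omits the human-in-the-loop component entirely. The class name `RecourseLinUCBSketch`, the parameters `alpha` and `lam`, and the assumption that recourses come from a finite candidate set are all hypothetical choices made for illustration.

```python
import math


def _matvec(M, v):
    """Multiply a square matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))


class RecourseLinUCBSketch:
    """Illustrative only: per-arm LinUCB over candidate feature modifications.

    Assumptions (not from the paper): one linear model per arm, ridge
    parameter `lam`, exploration weight `alpha`, finite candidate set.
    """

    def __init__(self, n_arms, dim, alpha=1.0, lam=1.0):
        self.alpha = alpha
        # Store the inverse of A = lam * I + sum(x x^T) for each arm.
        self.A_inv = [
            [[(1.0 / lam if i == j else 0.0) for j in range(dim)] for i in range(dim)]
            for _ in range(n_arms)
        ]
        self.b = [[0.0] * dim for _ in range(n_arms)]

    def ucb(self, arm, x):
        """Estimated reward plus exploration bonus for (arm, features x)."""
        A_inv = self.A_inv[arm]
        theta = _matvec(A_inv, self.b[arm])
        bonus = math.sqrt(max(_dot(x, _matvec(A_inv, x)), 0.0))
        return _dot(theta, x) + self.alpha * bonus

    def select(self, candidates):
        """Return (candidate_index, arm) with the highest UCB score."""
        best = None
        for i, x in enumerate(candidates):
            for arm in range(len(self.A_inv)):
                score = self.ucb(arm, x)
                if best is None or score > best[0]:
                    best = (score, i, arm)
        return best[1], best[2]

    def update(self, arm, x, reward):
        """Rank-one Sherman-Morrison update of A_inv, then update b."""
        A_inv = self.A_inv[arm]
        Ax = _matvec(A_inv, x)  # A_inv is symmetric, so x^T A_inv = Ax^T
        denom = 1.0 + _dot(x, Ax)
        n = len(x)
        for i in range(n):
            for j in range(n):
                A_inv[i][j] -= Ax[i] * Ax[j] / denom
        self.b[arm] = [bi + reward * xi for bi, xi in zip(self.b[arm], x)]
```

In this sketch, each round would pass the patient's current features together with any reachable modified versions as `candidates`, call `select`, observe a reward, and call `update`. How the actual algorithm constrains feasible modifications and when it queries a human expert is specified in the paper, not here.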

