LG | Oct 18, 2024

HR-Bandit: Human-AI Collaborated Linear Recourse Bandit

arXiv:2410.14640v2 | 4 citations | h-index: 10 | AISTATS
Originality: Incremental advance
AI Analysis

This work addresses the need for efficient human-AI collaboration in domains such as healthcare, where actionable recourses are critical. Its contribution appears incremental: it extends existing bandit methods with human integration.

The paper tackles the problem of optimizing actionable recourses in human-AI collaboration settings such as healthcare. It proposes the HR-Bandit algorithm, which integrates human expertise and comes with guarantees such as a warm start and sublinear regret, and it validates the algorithm empirically against benchmarks.

Human doctors frequently recommend actionable recourses that allow patients to modify their conditions to access more effective treatments. Inspired by such healthcare scenarios, we propose the Recourse Linear UCB ($\textsf{RLinUCB}$) algorithm, which optimizes both action selection and feature modifications by balancing exploration and exploitation. We further extend this to the Human-AI Linear Recourse Bandit ($\textsf{HR-Bandit}$), which integrates human expertise to enhance performance. $\textsf{HR-Bandit}$ offers three key guarantees: (i) a warm-start guarantee for improved initial performance, (ii) a human-effort guarantee to minimize required human interactions, and (iii) a robustness guarantee that ensures sublinear regret even when human decisions are suboptimal. Empirical results, including a healthcare case study, validate its superior performance against existing benchmarks.
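The abstract does not spell out how $\textsf{RLinUCB}$ is computed, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a standard per-action LinUCB model (ridge regression plus a confidence bonus) and a hypothetical recourse set in which each feature may shift by at most `radius`, searched on a coarse grid. All names (`LinUCBArm`, `recourse_candidates`, `select_action_and_recourse`, `simulate`) and all parameter values are illustrative assumptions.

```python
import math
import random
from itertools import product


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, v):
    return [dot(row, v) for row in M]


class LinUCBArm:
    """Ridge-regression state for one action (standard LinUCB bookkeeping)."""

    def __init__(self, dim, reg=1.0):
        # Inverse of A = reg * I, kept up to date with rank-one updates.
        self.A_inv = [[(1.0 / reg if i == j else 0.0) for j in range(dim)]
                      for i in range(dim)]
        self.b = [0.0] * dim

    def ucb(self, x, alpha):
        """Estimated reward for features x plus an exploration bonus."""
        theta = matvec(self.A_inv, self.b)
        width = math.sqrt(max(dot(x, matvec(self.A_inv, x)), 0.0))
        return dot(theta, x) + alpha * width

    def update(self, x, reward):
        """Sherman-Morrison update of A_inv after observing (x, reward)."""
        Ax = matvec(self.A_inv, x)  # A_inv is symmetric, so x^T A_inv = Ax^T
        denom = 1.0 + dot(x, Ax)
        for i in range(len(x)):
            for j in range(len(x)):
                self.A_inv[i][j] -= Ax[i] * Ax[j] / denom
            self.b[i] += reward * x[i]


def recourse_candidates(x0, radius):
    """Assumed recourse set: shift each feature by -radius, 0, or +radius."""
    steps = (-radius, 0.0, radius)
    return [[xi + s for xi, s in zip(x0, delta)]
            for delta in product(steps, repeat=len(x0))]


def select_action_and_recourse(arms, x0, radius, alpha):
    """Jointly pick the action and the modified features with the highest UCB."""
    best = None
    for a, arm in enumerate(arms):
        for x in recourse_candidates(x0, radius):
            score = arm.ucb(x, alpha)
            if best is None or score > best[0]:
                best = (score, a, x)
    return best[1], best[2]


def simulate(rounds=200, seed=0):
    """Toy run with a hypothetical ground truth; returns the total reward."""
    rng = random.Random(seed)
    true_theta = [[1.0, 0.0], [0.0, 1.0]]  # made-up parameters, one per action
    arms = [LinUCBArm(dim=2) for _ in true_theta]
    total = 0.0
    for _ in range(rounds):
        x0 = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        a, x = select_action_and_recourse(arms, x0, radius=0.5, alpha=1.0)
        reward = dot(true_theta[a], x) + rng.gauss(0.0, 0.1)
        arms[a].update(x, reward)
        total += reward
    return total
```

Calling `simulate()` runs the toy loop end to end. The grid search keeps the sketch short; it grows exponentially with the feature dimension, so a real recourse set would need a proper optimizer. The human-in-the-loop component of $\textsf{HR-Bandit}$ is deliberately left out, since the abstract gives no detail on how human recommendations are queried or combined.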

