Survey Bandits with Regret Guarantees
The work addresses the high cost of data acquisition in applications such as healthcare, though the contribution appears incremental because it builds on standard contextual bandits.
The paper tackles costly feature collection in contextual bandits, which arises in settings such as healthcare, by proposing algorithms that collect fewer features while maintaining strong regret guarantees.
We consider a variant of the contextual bandit problem. In standard contextual bandits, when a user arrives we get the user's complete feature vector and then assign a treatment (arm) to that user. In a number of applications (like healthcare), collecting features from users can be costly. To address this issue, we propose algorithms that avoid needless feature collection while maintaining strong regret guarantees.
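To make the baseline setting concrete, the following is a minimal sketch of the standard contextual bandit protocol described above, in which the full feature vector is collected for every user before an arm is assigned. It is not the paper's algorithm: the linear reward model, the epsilon-greedy policy with per-arm ridge regression, and all names and parameters (`run_contextual_bandit`, `theta`, `epsilon`, and so on) are assumptions chosen only for illustration.

```python
import numpy as np


def run_contextual_bandit(n_rounds=2000, n_arms=3, n_features=5, epsilon=0.1, seed=0):
    """Simulate the standard protocol: observe all features, pick an arm, see its reward."""
    rng = np.random.default_rng(seed)
    # Hypothetical ground-truth linear reward parameters, one row per arm.
    theta = rng.normal(size=(n_arms, n_features))
    # Per-arm ridge-regression statistics: A = I + sum(x x^T), b = sum(r x).
    A = np.stack([np.eye(n_features) for _ in range(n_arms)])
    b = np.zeros((n_arms, n_features))

    cumulative_regret = 0.0
    features_collected = 0
    for _ in range(n_rounds):
        # Standard setting: the complete feature vector is collected for every user.
        x = rng.normal(size=n_features)
        features_collected += n_features

        if rng.random() < epsilon:
            # Explore: choose an arm uniformly at random.
            arm = int(rng.integers(n_arms))
        else:
            # Exploit: choose the arm with the highest estimated reward.
            estimates = np.array(
                [np.linalg.solve(A[a], b[a]) @ x for a in range(n_arms)]
            )
            arm = int(np.argmax(estimates))

        expected = theta @ x  # expected reward of each arm for this user
        reward = expected[arm] + rng.normal(scale=0.1)

        # Update the statistics of the chosen arm only.
        A[arm] += np.outer(x, x)
        b[arm] += reward * x

        # Regret: gap between the best arm's expected reward and the chosen arm's.
        cumulative_regret += expected.max() - expected[arm]

    return cumulative_regret, features_collected


regret, collected = run_contextual_bandit()
print(f"cumulative regret: {regret:.2f}, feature values collected: {collected}")
```

In this baseline the number of feature values collected is always `n_rounds * n_features`, whether or not each feature influences the decision. That count is the cost the proposed algorithms aim to reduce while keeping regret low.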