Bandit Models of Human Behavior: Reward Processing in Mental Disorders
This work addresses the need for computational models of reward processing abnormalities in neurological and psychiatric conditions such as Parkinson's and Alzheimer's diseases, ADHD, addiction, and chronic pain. The contribution is incremental, since it builds on the existing Thompson Sampling method.
The authors model human decision-making in mental disorders with a parametric multi-armed bandit framework that extends Thompson Sampling to incorporate reward processing biases, and they show empirically that it often outperforms the Thompson Sampling baseline on a variety of datasets.
Drawing inspiration from behavioral studies of human decision making, we propose a general parametric framework for the multi-armed bandit problem, which extends the standard Thompson Sampling approach to incorporate reward processing biases associated with several neurological and psychiatric conditions, including Parkinson's and Alzheimer's diseases, attention-deficit/hyperactivity disorder (ADHD), addiction, and chronic pain. We demonstrate empirically that the proposed parametric approach can often outperform the baseline Thompson Sampling on a variety of datasets. Moreover, from the behavioral modeling perspective, our parametric framework can be viewed as a first step towards a unifying computational model capturing reward processing abnormalities across multiple mental conditions.
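To make the idea of a parametric extension of Thompson Sampling concrete, the following is a minimal sketch for Bernoulli rewards. The abstract does not state the parametrization, so everything specific here is an assumption made for illustration: the class name BiasedThompsonSampling, the weights w_pos and w_neg applied to incoming positive and negative rewards, the factors lam_pos and lam_neg applied to the accumulated history, and the update rule itself. With all four parameters set to 1.0 the sketch reduces to standard Thompson Sampling with a Beta(1, 1) prior.

```python
import random


class BiasedThompsonSampling:
    """Bernoulli Thompson Sampling with illustrative reward-processing biases.

    All four bias parameters are hypothetical names chosen for this sketch:
      w_pos, w_neg     -- weights on incoming positive / negative rewards
      lam_pos, lam_neg -- factors applied to the stored positive / negative history
    Setting all of them to 1.0 gives standard Thompson Sampling.
    """

    def __init__(self, n_arms, w_pos=1.0, w_neg=1.0,
                 lam_pos=1.0, lam_neg=1.0, seed=None):
        self.n_arms = n_arms
        self.w_pos, self.w_neg = w_pos, w_neg
        self.lam_pos, self.lam_neg = lam_pos, lam_neg
        # Weighted counts of observed successes and failures per arm.
        self.successes = [0.0] * n_arms
        self.failures = [0.0] * n_arms
        self.rng = random.Random(seed)

    def select_arm(self):
        # Sample a plausible success rate for each arm from its Beta posterior
        # (the "1 +" terms are the uniform Beta(1, 1) prior) and play the best.
        samples = [
            self.rng.betavariate(1.0 + self.successes[k], 1.0 + self.failures[k])
            for k in range(self.n_arms)
        ]
        return max(range(self.n_arms), key=samples.__getitem__)

    def update(self, arm, reward):
        # reward is 1 (positive outcome) or 0 (negative outcome).
        # The history of the played arm is rescaled, then the new outcome is
        # added with its own weight. This rule is an assumption of the sketch.
        self.successes[arm] = self.lam_pos * self.successes[arm] + self.w_pos * reward
        self.failures[arm] = self.lam_neg * self.failures[arm] + self.w_neg * (1 - reward)


def run(agent, probs, horizon, seed=None):
    """Play `horizon` rounds against Bernoulli arms and return the total reward."""
    env = random.Random(seed)
    total = 0
    for _ in range(horizon):
        arm = agent.select_arm()
        reward = 1 if env.random() < probs[arm] else 0
        agent.update(arm, reward)
        total += reward
    return total


if __name__ == "__main__":
    baseline = BiasedThompsonSampling(2, seed=0)
    # Hypothetical agent that overweights negative outcomes and forgets
    # part of its positive history after every pull.
    biased = BiasedThompsonSampling(2, w_neg=2.0, lam_pos=0.9, seed=0)
    print(run(baseline, [0.3, 0.7], 1000, seed=1))
    print(run(biased, [0.3, 0.7], 1000, seed=1))
```

In this sketch, different settings of the four parameters stand in for different reward processing biases, for example a stronger response to negative outcomes or faster forgetting of positive ones. Which settings correspond to which condition is not specified in the abstract and is left open here.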