ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment
This work addresses real-time adaptive deep brain stimulation for Parkinson's disease patients, offering a more sample-efficient alternative to reinforcement learning methods. The contribution is incremental in that it builds on existing contextual multi-armed bandit (CMAB) approaches.
The paper targets the energy inefficiency and side effects of traditional continuous deep brain stimulation (cDBS) for Parkinson's disease. It proposes an adaptive DBS (aDBS) method based on contextual multi-armed bandits, specifically an ε-Neural Thompson sampling algorithm, which outperforms existing cDBS and CMAB baselines in a computational model.
Deep Brain Stimulation (DBS) is an effective intervention for alleviating the motor symptoms of Parkinson's disease (PD). Traditional commercial DBS devices can only deliver fixed-frequency periodic pulses to the basal ganglia (BG) regions of the brain, i.e., continuous DBS (cDBS). However, these devices generally suffer from energy inefficiency and side effects, such as speech impairment. Recent research has focused on adaptive DBS (aDBS) to resolve the limitations of cDBS. Specifically, reinforcement learning (RL) based approaches have been developed to adapt the frequencies of the stimuli in order to achieve both energy efficiency and treatment efficacy. However, RL approaches generally require a significant amount of training data and computational resources, making it intractable to integrate RL policies into the real-time embedded systems needed for aDBS. In contrast, contextual multi-armed bandits (CMAB) generally offer better sample efficiency than RL. In this study, we propose a CMAB solution for aDBS. Specifically, we define the context as the signals capturing irregular neuronal firing activities in the BG regions (i.e., beta-band power spectral density), while each arm represents a (discretized) pulse frequency of the stimulation. Moreover, an ε-exploring strategy is introduced on top of the classic Thompson sampling method, leading to an algorithm called ε-Neural Thompson sampling (ε-NeuralTS), such that the learned CMAB policy can better balance exploration and exploitation of the BG environment. The ε-NeuralTS algorithm is evaluated using a computational BG model that captures the neuronal activities in PD patients' brains. The results show that our method outperforms both existing cDBS methods and CMAB baselines.
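The CMAB formulation above can be illustrated with a minimal sketch, which is not the authors' implementation. It rests on two assumptions that the abstract does not confirm. First, the ε-exploring step is taken to mean that the agent draws Thompson samples with probability ε and otherwise acts greedily on the estimated mean reward. Second, the neural reward model is replaced by a per-arm Bayesian linear regression to keep the example short. The class name `EpsilonLinearTS`, its parameters, the number of arms, and the toy reward are all hypothetical.

```python
import numpy as np


class EpsilonLinearTS:
    """Epsilon-exploring Thompson sampling with a linear reward model per arm.

    Assumption: a linear model stands in for the neural network of
    epsilon-NeuralTS; all names and defaults here are hypothetical.
    """

    def __init__(self, n_arms, context_dim, epsilon=0.1,
                 noise_scale=1.0, reg=1.0, seed=0):
        self.n_arms = n_arms
        self.epsilon = epsilon
        self.noise_scale = noise_scale
        self.rng = np.random.default_rng(seed)
        # Ridge-regression sufficient statistics, one set per arm.
        self.A = np.stack([reg * np.eye(context_dim) for _ in range(n_arms)])
        self.b = np.zeros((n_arms, context_dim))

    def select_arm(self, context):
        context = np.asarray(context, dtype=float)
        # Assumed rule: sample from the posterior with probability epsilon,
        # otherwise use the posterior mean (greedy choice).
        explore = self.rng.random() < self.epsilon
        scores = np.empty(self.n_arms)
        for a in range(self.n_arms):
            A_inv = np.linalg.inv(self.A[a])
            mean = context @ A_inv @ self.b[a]
            if explore:
                std = self.noise_scale * np.sqrt(context @ A_inv @ context)
                scores[a] = self.rng.normal(mean, std)
            else:
                scores[a] = mean
        return int(np.argmax(scores))

    def update(self, arm, context, reward):
        context = np.asarray(context, dtype=float)
        self.A[arm] += np.outer(context, context)
        self.b[arm] += reward * context


if __name__ == "__main__":
    # Hypothetical setup: 4 discretized pulse frequencies (arms) and a
    # context of [bias, beta-band power]. The reward is invented for
    # illustration and does not come from a BG model.
    agent = EpsilonLinearTS(n_arms=4, context_dim=2, epsilon=0.2, seed=1)
    env_rng = np.random.default_rng(2)
    for _ in range(500):
        beta_power = env_rng.uniform(0.0, 1.0)
        context = np.array([1.0, beta_power])
        arm = agent.select_arm(context)
        reward = (-abs(beta_power - arm / 3.0) - 0.05 * arm
                  + env_rng.normal(0.0, 0.05))
        agent.update(arm, context, reward)
    print(agent.select_arm(np.array([1.0, 0.9])))
```

In this sketch, setting `epsilon=1.0` recovers ordinary Thompson sampling and `epsilon=0.0` gives a purely greedy policy, so the single parameter controls how often posterior sampling drives the choice of stimulation frequency.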