LG MAJan 7, 2021

Active Screening for Recurrent Diseases: A Reinforcement Learning Approach

Han-Ching Ou, Haipeng Chen, Shahin Jabbari, Milind Tambe

arXiv:2101.02766v38.49 citations

Originality Highly original

AI Analysis

This work is significant for public health officials and policymakers who need to optimize resource allocation for disease screening, offering a method that scales to longer time horizons than previous approaches.

This paper addresses the challenge of active screening for recurrent infectious diseases by proposing a novel reinforcement learning (RL) approach. The goal is to minimize infections over a long time horizon, and the method tackles the combinatorial, sequential planning, and uncertainty challenges inherent in the problem. The authors evaluate their RL algorithm on several real-world networks.

Active screening is a common approach in controlling the spread of recurring infectious diseases such as tuberculosis and influenza. In this approach, health workers periodically select a subset of population for screening. However, given the limited number of health workers, only a small subset of the population can be visited in any given time period. Given the recurrent nature of the disease and rapid spreading, the goal is to minimize the number of infections over a long time horizon. Active screening can be formalized as a sequential combinatorial optimization over the network of people and their connections. The main computational challenges in this formalization arise from i) the combinatorial nature of the problem, ii) the need of sequential planning and iii) the uncertainties in the infectiousness states of the population. Previous works on active screening fail to scale to large time horizon while fully considering the future effect of current interventions. In this paper, we propose a novel reinforcement learning (RL) approach based on Deep Q-Networks (DQN), with several innovative adaptations that are designed to address the above challenges. First, we use graph convolutional networks (GCNs) to represent the Q-function that exploit the node correlations of the underlying contact network. Second, to avoid solving a combinatorial optimization problem in each time period, we decompose the node set selection as a sub-sequence of decisions, and further design a two-level RL framework that solves the problem in a hierarchical way. Finally, to speed-up the slow convergence of RL which arises from reward sparseness, we incorporate ideas from curriculum learning into our hierarchical RL approach. We evaluate our RL algorithm on several real-world networks.

View on arXiv PDF

Similar