Optimising Lockdown Policies for Epidemic Control using Reinforcement Learning
This paper addresses the challenge of optimizing lockdown policies for policymakers during pandemics such as Covid-19. The contribution appears incremental, since it applies existing reinforcement learning methods to this specific domain.
The paper tackles the problem of balancing health and economic considerations in epidemic control. It proposes a reinforcement learning algorithm that computes lockdown policies for individual cities or regions from disease parameters and population characteristics, and presents the learnt policy as a viable quantitative approach.
In the context of the ongoing Covid-19 pandemic, several reports and studies have attempted to model and predict the spread of the disease. There is also intense debate about policies for limiting the damage, both to health and to the economy. On the one hand, the health and safety of the population is the principal consideration for most countries. On the other hand, we cannot ignore the potential for long-term economic damage caused by strict nation-wide lockdowns. In this working paper, we present a quantitative way to compute lockdown decisions for individual cities or regions, while balancing health and economic considerations. Furthermore, these policies are learnt automatically by the proposed algorithm, as a function of disease parameters (infectiousness, gestation period, duration of symptoms, probability of death) and population characteristics (density, movement propensity). We account for realistic considerations such as imperfect lockdowns, and show that the policy obtained using reinforcement learning is a viable quantitative approach towards lockdowns.
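To make the trade-off described above concrete, the following is a minimal sketch of the kind of environment a reinforcement learning agent could interact with. It is not the paper's model: the single-region SEIR-style compartments, every parameter value, the `compliance` factor standing in for an imperfect lockdown, and the reward weights are all assumptions chosen for illustration. The field `incubation_days` plays the role of what the abstract calls the gestation period.

```python
from dataclasses import dataclass


@dataclass
class DiseaseParams:
    # All values are hypothetical placeholders, not taken from the paper.
    beta: float = 0.3             # infectiousness: transmissions per infectious person per day
    incubation_days: float = 5.0  # "gestation period" in the abstract's wording
    symptom_days: float = 10.0    # duration of symptoms
    death_prob: float = 0.02      # probability that a resolved case ends in death


@dataclass
class State:
    s: float  # susceptible
    e: float  # exposed, not yet infectious
    i: float  # infectious
    r: float  # recovered
    d: float  # dead


def step(state, params, lockdown, compliance=0.8, contact_scale=1.0):
    """Advance one day. `lockdown` is 0 (open) or 1 (locked down).

    `compliance` < 1 models an imperfect lockdown: only that fraction of
    contacts is removed. `contact_scale` is a stand-in for population
    density and movement propensity.
    """
    alive = state.s + state.e + state.i + state.r
    contact = contact_scale * (1.0 - compliance * lockdown)
    new_exposed = params.beta * contact * state.s * state.i / alive
    new_infectious = state.e / params.incubation_days
    resolved = state.i / params.symptom_days
    deaths = params.death_prob * resolved
    recovered = resolved - deaths
    next_state = State(
        s=state.s - new_exposed,
        e=state.e + new_exposed - new_infectious,
        i=state.i + new_infectious - resolved,
        r=state.r + recovered,
        d=state.d + deaths,
    )
    return next_state, new_exposed, deaths


def step_reward(new_infections, deaths, lockdown,
                w_infection=1.0, w_death=50.0, w_lockdown=5.0):
    """Negative weighted cost: health terms plus an economic lockdown penalty.

    The weights are assumptions; choosing them is exactly the health versus
    economy trade-off that a policymaker would have to specify.
    """
    return -(w_infection * new_infections + w_death * deaths + w_lockdown * lockdown)


if __name__ == "__main__":
    params = DiseaseParams()
    start = State(s=990.0, e=0.0, i=10.0, r=0.0, d=0.0)
    for action in (0, 1):
        _, infections, deaths = step(start, params, lockdown=action)
        print(action, infections, step_reward(infections, deaths, action))
```

A learning agent would observe the state for its region each day (or week), choose `lockdown`, and receive `step_reward`; the policy it learns then depends on the disease and population parameters, as the abstract describes. Any standard reinforcement learning method could be trained against such a loop, and the sketch makes no claim about which one the paper uses.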