LG SOC-PH ML | May 15, 2020

A Deep Q-Learning/Genetic Algorithms Based Novel Methodology for Optimizing COVID-19 Pandemic Government Actions

Luis Miralles-Pechuán, Fernando Jiménez, Hiram Ponce, Lourdes Martínez-Villaseñor

arXiv:2005.07656v1

Originality: Incremental advance

AI Analysis

This work addresses the challenge governments face in making data-driven decisions during pandemics, though it appears incremental because it combines existing methods: SEIR models, reinforcement learning, and evolutionary algorithms.

The paper tackles the problem of optimizing government actions during the COVID-19 pandemic. It proposes a methodology that uses Deep Q-Learning and Genetic Algorithms to sequence actions such as confinement and self-isolation, aiming to balance public health and economic impacts. Experiments show that the Deep Q-Learning approach outperforms Genetic Algorithms in optimizing these sequences.

Whenever countries are threatened by a pandemic, as is the case with COVID-19, governments should take the right actions to safeguard public health as well as to mitigate the negative effects on the economy. In this regard, there are two completely different approaches governments can take: a restrictive one, in which drastic measures such as self-isolation can seriously damage the economy, and a more liberal one, in which more relaxed restrictions may put a high percentage of the population at risk. The optimal approach could be somewhere in between, and, in order to make the right decisions, it is necessary to accurately estimate the future effects of each measure. In this paper, we use the SEIR epidemiological model (Susceptible - Exposed - Infected - Recovered) for infectious diseases to represent the spread of COVID-19 through the population over time. To find the best sequences of actions governments can take, we propose a methodology with two approaches, one based on Deep Q-Learning and another based on Genetic Algorithms. The sequences of actions (confinement, self-isolation, two-meter distancing, or no restrictions) are evaluated according to a reward system focused on meeting two objectives: first, keeping the number of infected people low so that hospitals are not overwhelmed with critical patients, and second, avoiding drastic measures that last so long that they can seriously damage the economy. The conducted experiments show that our methodology is a valid tool for discovering actions governments can take to reduce the negative effects of a pandemic on both fronts. We also show that the approach based on Deep Q-Learning outperforms the one based on Genetic Algorithms for optimizing the sequences of actions.
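To make the setup in the abstract concrete, the following is a minimal Python sketch of how a sequence of actions might be scored against a SEIR simulation. It is an illustration under stated assumptions, not the authors' implementation: the contact-rate multipliers, the daily economic costs, the epidemiological parameters, the one-action-per-week schedule, and the reward formula are all hypothetical, as are the names `SEIR`, `seir_step`, and `evaluate`. An optimizer such as Deep Q-Learning or a Genetic Algorithm would search over the `actions` list that `evaluate` scores.

```python
from dataclasses import dataclass

# Hypothetical values: how strongly each action scales the contact rate,
# and a made-up economic cost per day. None of these come from the paper.
BETA_SCALE = {"none": 1.0, "distance_2m": 0.7, "self_isolation": 0.4, "confinement": 0.2}
DAILY_COST = {"none": 0.0, "distance_2m": 0.1, "self_isolation": 0.5, "confinement": 1.0}


@dataclass
class SEIR:
    s: float  # susceptible
    e: float  # exposed
    i: float  # infected
    r: float  # recovered

    @property
    def n(self) -> float:
        """Total population across the four compartments."""
        return self.s + self.e + self.i + self.r


def seir_step(state, beta, sigma, gamma):
    """Advance the SEIR compartments by one day (forward Euler step)."""
    new_exposed = beta * state.s * state.i / state.n
    new_infected = sigma * state.e
    new_recovered = gamma * state.i
    return SEIR(
        s=state.s - new_exposed,
        e=state.e + new_exposed - new_infected,
        i=state.i + new_infected - new_recovered,
        r=state.r + new_recovered,
    )


def evaluate(actions, start, beta=0.5, sigma=1 / 5.2, gamma=0.1,
             days_per_action=7, health_weight=1000.0):
    """Score a sequence of actions; returns (reward, peak_infected, final_state).

    The reward is an assumed trade-off: it penalizes the peak infected
    fraction (hospital load) and the accumulated economic cost.
    """
    state, peak, cost = start, start.i, 0.0
    for action in actions:
        for _ in range(days_per_action):
            state = seir_step(state, beta * BETA_SCALE[action], sigma, gamma)
            peak = max(peak, state.i)
            cost += DAILY_COST[action]
    reward = -health_weight * peak / start.n - cost
    return reward, peak, state


start = SEIR(s=999_990.0, e=0.0, i=10.0, r=0.0)
plans = {
    "no restrictions": ["none"] * 12,
    "full confinement": ["confinement"] * 12,
    "mixed": ["confinement"] * 4 + ["distance_2m"] * 8,
}
for name, plan in plans.items():
    reward, peak, _ = evaluate(plan, start)
    print(f"{name}: reward={reward:.1f}, peak infected={peak:.0f}")
```

Collapsing both objectives into a single scalar reward is one simple design choice for this sketch; the relative weight given to health versus economic cost determines which sequences score best and would need to be chosen deliberately.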

