NANov 6, 2015
Estimating numerical errors due to operator splitting in global atmospheric chemistry models: Transport and chemistryMauricio Santillana, Ling Zhang, Robert Yantosca
We present upper bounds for the numerical errors introduced when using operator splitting methods to integrate transport and non-linear chemistry processes in global chemical transport models (CTM). We show that (a) operator splitting strategies that evaluate the stiff non-linear chemistry operator at the end of the time step are more accurate, and (b) the results of numerical simulations that use different operator splitting strategies differ by at most 10 percent, in a prototype one-dimensional non-linear chemistry-transport model. We find similar upper bounds in operator splitting numerical errors in global CTM simulations.
OTApr 8, 2020
A machine learning methodology for real-time forecasting of the 2019-2020 COVID-19 outbreak using Internet searches, news alerts, and estimates from mechanistic modelsDianbo Liu, Leonardo Clemente, Canelle Poirier et al.
We present a timely and novel methodology that combines disease estimates from mechanistic models with digital traces, via interpretable machine-learning methodologies, to reliably forecast COVID-19 activity in Chinese provinces in real-time. Specifically, our method is able to produce stable and accurate forecasts 2 days ahead of current time, and uses as inputs (a) official health reports from Chinese Center Disease for Control and Prevention (China CDC), (b) COVID-19-related internet search activity from Baidu, (c) news media activity reported by Media Cloud, and (d) daily forecasts of COVID-19 activity from GLEAM, an agent-based mechanistic model. Our machine-learning methodology uses a clustering technique that enables the exploitation of geo-spatial synchronicities of COVID-19 activity across Chinese provinces, and a data augmentation technique to deal with the small number of historical disease activity observations, characteristic of emerging outbreaks. Our model's predictive power outperforms a collection of baseline models in 27 out of the 32 Chinese provinces, and could be easily extended to other geographies currently affected by the COVID-19 outbreak to help decision makers.
LGNov 6, 2019
Towards the Use of Neural Networks for Influenza Prediction at Multiple Spatial ResolutionsEmily L. Aiken, Andre T. Nguyen, Mauricio Santillana
We introduce the use of a Gated Recurrent Unit (GRU) for influenza prediction at the state- and city-level in the US, and experiment with the inclusion of real-time flu-related Internet search data. We find that a GRU has lower prediction error than current state-of-the-art methods for data-driven influenza prediction at time horizons of over two weeks. In contrast with other machine learning approaches, the inclusion of real-time Internet search data does not improve GRU predictions.
APDec 13, 2015
Cloud-based Electronic Health Records for Real-time, Region-specific Influenza SurveillanceMauricio Santillana, Andre Nguyen, Tamara Louie et al.
Accurate real-time monitoring systems of influenza outbreaks help public health officials make informed decisions that may help save lives. We show that information extracted from cloud-based electronic health records databases, in combination with machine learning techniques and historical epidemiological information, have the potential to accurately and reliably provide near real-time regional predictions of flu outbreaks in the United States.
APMay 5, 2015
Accurate estimation of influenza epidemics using Google search data via ARGOShihao Yang, Mauricio Santillana, S. C. Kou
Accurate real-time tracking of influenza outbreaks helps public health officials make timely and meaningful decisions that could save lives. We propose an influenza tracking model, ARGO (AutoRegression with GOogle search data), that uses publicly available online search data. In addition to having a rigorous statistical foundation, ARGO outperforms all previously available Google-search-based tracking models, including the latest version of Google Flu Trends, even though it uses only low-quality search data as input from publicly available Google Trends and Google Correlate websites. ARGO not only incorporates the seasonality in influenza epidemics but also captures changes in people's online search behavior over time. ARGO is also flexible, self-correcting, robust, and scalable, making it a potentially powerful tool that can be used for real-time tracking of other social events at multiple temporal and spatial resolutions.