Majd Al Aawar

LG
h-index: 4
3 papers
6 citations
Novelty: 60%
AI Score: 34

3 Papers

LG · Jun 17, 2022
Random Forest of Epidemiological Models for Influenza Forecasting

Majd Al Aawar, Ajitesh Srivastava

Forecasting hospitalizations caused by the influenza virus is vital for public health planning, so that hospitals can be better prepared for an influx of patients. Many forecasting methods have been used in real time during influenza seasons and submitted to the CDC for public communication. These forecasting models range from mechanistic and auto-regressive models to machine learning models. We hypothesize that we can improve forecasting by using multiple mechanistic models to produce potential trajectories and using machine learning to learn how to combine those trajectories into an improved forecast. We propose a Tree Ensemble model design that utilizes the individual predictors of our baseline model SIkJalpha to improve its performance. Each predictor is generated by changing a set of hyper-parameters. We compare our prospective forecasts deployed for the FluSight challenge (2022) to all the other submitted approaches. Our approach is fully automated and does not require any manual tuning. We demonstrate that our Random Forest-based approach improves upon the forecasts of the individual predictors in terms of mean absolute error, coverage, and weighted interval score. Our method outperforms all other models in terms of the mean absolute error and the weighted interval score, based on the mean across all weekly submissions in the current season (2022). Explainability of the Random Forest (through analysis of the trees) enables us to gain insights into how it improves upon the individual predictors.
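
The abstract above names the weighted interval score (WIS) without defining it. As a rough illustration, the sketch below follows the definition commonly used in probabilistic forecast evaluation: a weighted sum of the absolute error of the median and the interval scores of several central prediction intervals. It is an assumption that this matches the evaluation used in the paper, and the function names `interval_score` and `weighted_interval_score` are hypothetical.

```python
def interval_score(lower, upper, y, alpha):
    """Interval score of a central (1 - alpha) prediction interval [lower, upper]
    for an observed value y: interval width plus penalties for misses."""
    width = upper - lower
    penalty_below = (2.0 / alpha) * max(lower - y, 0.0)  # y fell below the interval
    penalty_above = (2.0 / alpha) * max(y - upper, 0.0)  # y fell above the interval
    return width + penalty_below + penalty_above


def weighted_interval_score(median, intervals, y):
    """WIS for one observation.

    intervals: dict mapping alpha -> (lower, upper), one entry per central
    (1 - alpha) prediction interval. Weights follow the common convention
    w_0 = 1/2 for the median term and w_k = alpha_k / 2 for each interval.
    """
    total = 0.5 * abs(y - median)
    for alpha, (lower, upper) in intervals.items():
        total += (alpha / 2.0) * interval_score(lower, upper, y, alpha)
    return total / (len(intervals) + 0.5)


if __name__ == "__main__":
    # Illustrative numbers only: a median of 10 with a 50% interval of [8, 12].
    print(weighted_interval_score(10.0, {0.5: (8.0, 12.0)}, 14.0))
```

With no intervals supplied, the score reduces to the absolute error of the median, which is why WIS is often read as a generalization of absolute error to probabilistic forecasts.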

PE · Jan 7, 2024
Global Prediction of COVID-19 Variant Emergence Using Dynamics-Informed Graph Neural Networks

Majd Al Aawar, Srikar Mutnuri, Mansooreh Montazerin et al.

During the COVID-19 pandemic, a major driver of new surges has been the emergence of new variants. When a new variant emerges in one or more countries, other nations monitor its spread in preparation for its potential arrival. The impact of the new variant and the timing of epidemic peaks in a country depend heavily on when the variant arrives. Current methods for predicting the spread of new variants rely on statistical modeling; however, these methods work only when the new variant has already arrived in the region of interest and has a significant prevalence. Can we predict when a variant existing elsewhere will arrive in a given region? To address this question, we propose a variant-dynamics-informed Graph Neural Network (GNN) approach. First, we derive the dynamics of variant prevalence across pairs of regions (countries) that apply to a large class of epidemic models. The dynamics motivate the introduction of certain features in the GNN. We demonstrate that our proposed dynamics-informed GNN outperforms all the baselines, including the currently pervasive framework of Physics-Informed Neural Networks (PINNs). To advance research in this area, we introduce a benchmarking tool to assess a user-defined model's prediction performance across 87 countries and 36 variants.

LG · Jun 9, 2025
Sparse Interpretable Deep Learning with LIES Networks for Symbolic Regression

Mansooreh Montazerin, Majd Al Aawar, Antonio Ortega et al.

Symbolic regression (SR) aims to discover closed-form mathematical expressions that accurately describe data, offering interpretability and analytical insight beyond standard black-box models. Existing SR methods often rely on population-based search or autoregressive modeling, which struggle with scalability and symbolic consistency. We introduce LIES (Logarithm, Identity, Exponential, Sine), a fixed neural network architecture with interpretable primitive activations that are optimized to model symbolic expressions. We develop a framework to extract compact formulae from LIES networks by training with an appropriate oversampling strategy and a tailored loss function to promote sparsity and to prevent gradient instability. After training, the framework applies additional pruning strategies to further simplify the learned expressions into compact formulae. Our experiments on SR benchmarks show that the LIES framework consistently produces sparse and accurate symbolic formulae, outperforming all baselines. We also demonstrate the importance of each design component through ablation studies.