Heman Shakeri

Semantic Scholar Profile

h-index: 12

16 papers

29 citations

Novelty: 46%

AI Score: 52

Ranked #33,914 of 201,326 authors (top 17%); #7,832 in LG (top 18%)

16 Papers

33.5 · DC · Apr 23 · Code

FlashSpread: IO-Aware GPU Simulation of Non-Markovian Epidemic Dynamics via Kernel Fusion

Heman Shakeri, Behnaz Moradi-Jamei, Aram Vajdi et al.

Non-Markovian (renewal) epidemic simulation on multi-million-node contact networks is essential for realistic forecasting under general age-dependent holding-time distributions (log-normal, Weibull, Erlang, and similar), but the age-dependent hazard forces dense per-step updates that render the sparse event-queue strategies of standard CPU methods ineffective. We present FlashSpread, a GPU framework that consolidates the per-step renewal pipeline (CSR traversal, numerically stable erfcx-based hazard evaluation, Bernoulli tau-leaping, state transition, and next-step infectivity write-back) into a single fused Triton kernel whose intermediates never leave streaming-multiprocessor registers, with block-scalar skips that preserve CUDA Graph capture and a degree-aware CSR dispatch (thread / warp / edge-merge) that keeps the peak throughput on scale-free graphs. On an NVIDIA A100 the fused CUDA-Graph engine reaches 8.09 Giga-NUPS at N = 10^6 on a uniform-degree graph, a 217x strict hardware speedup over optimised CPU tau-leaping at the same N; on a Barabasi-Albert graph of the same size the merge-based dispatch recovers 4.5x (0.45 to 2.0 Giga-NUPS) over the default kernel, and the framework scales to N = 10^8 on a single A100 (40 GB), with a mixed-precision storage path that extends the L2-reachable scale by roughly 3x and delivers a 1.15x throughput lift at the far bandwidth-bound end. Validation against an exact non-Markovian Gillespie reference shows a structural-bias floor of approximately 6% on peak infection and approximately 7% on final attack rate that does not detectably decrease as epsilon nears 0 across two decades of tolerance, comfortably within typical epidemiological parameter uncertainty. Code: https://github.com/Shakeri-Lab/FlashSpread.
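The per-step renewal update described in this abstract can be illustrated on the CPU with a minimal sketch. It is not the FlashSpread kernel: it assumes a Weibull holding time (one of the distributions the abstract names) because its cumulative hazard has a closed form, and the function names `weibull_cum_hazard`, `transition_prob`, and `step_ages` are hypothetical. The point is that every node's firing probability depends on its age in state, so every node is touched at every step.

```python
import math
import random


def weibull_cum_hazard(age, shape, scale):
    """Cumulative hazard H(a) = (a / scale) ** shape of a Weibull holding time."""
    return (age / scale) ** shape


def transition_prob(age, dt, shape, scale):
    """Probability of leaving the state in [age, age + dt], given survival to `age`."""
    d_h = weibull_cum_hazard(age + dt, shape, scale) - weibull_cum_hazard(age, shape, scale)
    return 1.0 - math.exp(-d_h)


def step_ages(ages, dt, shape, scale, rng):
    """One Bernoulli tau-leaping step over all nodes (dense update).

    Assumption for illustration: a node that fires has its age reset to 0.0,
    standing in for entry into the next compartment.
    """
    new_ages, fired = [], []
    for age in ages:
        if rng.random() < transition_prob(age, dt, shape, scale):
            fired.append(True)
            new_ages.append(0.0)
        else:
            fired.append(False)
            new_ages.append(age + dt)
    return new_ages, fired


if __name__ == "__main__":
    rng = random.Random(0)
    ages, fired = step_ages([0.0, 1.0, 2.0], dt=0.1, shape=2.0, scale=2.0, rng=rng)
    print(ages, fired)
```

With `shape=1.0` the hazard is constant and the step reduces to the memoryless (Markovian) case; any other shape makes the probability age-dependent.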

LG · Apr 16, 2023 · Code

An Interpretable Approach to Load Profile Forecasting in Power Grids using Galerkin-Approximated Koopman Pseudospectra

Ali Tavasoli, Behnaz Moradijamei, Heman Shakeri

This paper presents an interpretable machine learning approach that characterizes load dynamics within an operator-theoretic framework for electricity load forecasting in power grids. We represent the dynamics of load data using the Koopman operator, which provides a linear, infinite-dimensional representation of the nonlinear dynamics, and approximate a finite version that remains robust against spectral pollution due to truncation. By computing $ε$-approximate Koopman eigenfunctions using dynamics-adapted kernels in delay coordinates, we decompose the load dynamics into coherent spatiotemporal patterns that evolve quasi-independently. Our approach captures temporally coherent patterns due to seasonal changes and finer time scales, such as time of day and day of the week. This method allows for a more nuanced understanding of the complex interactions within power grids and their response to various exogenous factors. We assess our method using a large-scale dataset from a renewable power system in the continental European electricity system. The results indicate that our Koopman-based method surpasses a separately optimized deep learning (LSTM) architecture in both accuracy and computational efficiency, while providing deeper insights into the underlying dynamics of the power grid. The code is available at https://github.com/Shakeri-Lab/Power-Grids.

LG · May 2, 2022

Using Machine Learning to Evaluate Real Estate Prices Using Location Big Data

Walter Coleman, Ben Johann, Nicholas Pasternak et al.

With so many people trying to enter the real estate market, accurate valuations for residential and commercial properties have become crucial. Past research has used static real estate data (e.g., number of beds, baths, square footage), sometimes combined with demographic information, to predict property prices. In this investigation, we explore a different approach: we test whether mobile location data can improve the predictive power of popular regression and tree-based models. To prepare the data, we attached the mobility data to individual properties from the real estate data by aggregating users within 500 meters of each property for each day of the week. We removed people who lived within 500 meters of each property, so each property's aggregated mobility data contained only non-resident census features. On top of these dynamic census features, we also included static census features, including the number of people in the area, the average proportion of people commuting, and the number of residents in the area. Finally, we tested multiple models to predict real estate prices. Our proposed model consists of two random forest modules stacked using a ridge regression that takes the random forest outputs as predictors. The first random forest used static features only and the second used dynamic features only. Comparing our models with and without the dynamic mobile location features shows that the model with them achieves a 3% lower mean squared error than the same model without them.

LG · Dec 17, 2022

Leveraging Wastewater Monitoring for COVID-19 Forecasting in the US: a Deep Learning study

Mehrdad Fazli, Heman Shakeri

The outbreak of COVID-19 in late 2019 was the start of a health crisis that shook the world and took millions of lives in the ensuing years. Many governments and health officials failed to arrest the rapid circulation of infection in their communities. The long incubation period and the large proportion of asymptomatic cases made COVID-19 particularly elusive to track. However, wastewater monitoring soon became a promising data source in addition to conventional indicators such as confirmed daily cases, hospitalizations, and deaths. Despite the consensus on the effectiveness of wastewater viral load data, there is a lack of methodological approaches that leverage viral load to improve COVID-19 forecasting. This paper proposes using deep learning to automatically discover the relationship between daily confirmed cases and viral load data. We trained one Deep Temporal Convolutional Network (DeepTCN) and one Temporal Fusion Transformer (TFT) model to build a global forecasting model. We supplement the daily confirmed cases with viral loads and other socio-economic factors as covariates to the models. Our results suggest that TFT outperforms DeepTCN and learns a better association between viral load and daily cases. We demonstrated that equipping the models with the viral load improves their forecasting performance significantly. Moreover, viral load is shown to be the second most predictive input, following the containment and health index. Our results reveal the feasibility of training a location-agnostic deep-learning model to capture the dynamics of infection diffusion when wastewater viral load data is provided.

LG · Dec 10, 2025

Mitigating Exposure Bias in Risk-Aware Time Series Forecasting with Soft Tokens

Alireza Namazi, Amirreza Dolatpour Fathkouhi, Heman Shakeri

Autoregressive forecasting is central to predictive control in diabetes and hemodynamic management, where different operating zones carry different clinical risks. Standard models trained with teacher forcing suffer from exposure bias, yielding unstable multi-step forecasts for closed-loop use. We introduce Soft-Token Trajectory Forecasting (SoTra), which propagates continuous probability distributions ("soft tokens") to mitigate exposure bias and learn calibrated, uncertainty-aware trajectories. A risk-aware decoding module then minimizes expected clinical harm. In glucose forecasting, SoTra reduces average zone-based risk by 18%; in blood-pressure forecasting, it lowers effective clinical risk by approximately 15%. These improvements support its use in safety-critical predictive control.

LG · Feb 17

The Stationarity Bias: Stratified Stress-Testing for Time-Series Imputation in Regulated Dynamical Systems

Amirreza Dolatpour Fathkouhi, Alireza Namazi, Heman Shakeri

Time-series imputation benchmarks employ uniform random masking and shape-agnostic metrics (MSE, RMSE), implicitly weighting evaluation by regime prevalence. In systems with a dominant attractor -- homeostatic physiology, nominal industrial operation, stable network traffic -- this creates a systematic Stationarity Bias: simple methods appear superior because the benchmark predominantly samples the easy, low-entropy regime where they trivially succeed. We formalize this bias and propose a Stratified Stress-Test that partitions evaluation into Stationary and Transient regimes. Using Continuous Glucose Monitoring (CGM) as a testbed -- chosen for its rigorous ground-truth forcing functions (meals, insulin) that enable precise regime identification -- we establish three findings with broad implications: (i) Stationary Efficiency: Linear interpolation achieves state-of-the-art reconstruction during stable intervals, confirming that complex architectures are computationally wasteful in low-entropy regimes. (ii) Transient Fidelity: During critical transients (post-prandial peaks, hypoglycemic events), linear methods exhibit drastically degraded morphological fidelity (DTW), disproportionate to their RMSE -- a phenomenon we term the RMSE Mirage, where low pointwise error masks the destruction of signal shape. (iii) Regime-Conditional Model Selection: Deep learning models preserve both pointwise accuracy and morphological integrity during transients, making them essential for safety-critical downstream tasks. We further derive empirical missingness distributions from clinical trials and impose them on complete training data, preventing models from exploiting unrealistically clean observations and encouraging robustness under real-world missingness. This framework generalizes to any regulated system where routine stationarity dominates critical transients.
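The stratified comparison of a pointwise metric (RMSE) with a shape-sensitive one (DTW) can be sketched as follows. This is a minimal illustration, not the paper's benchmark: the glucose-like values are made up, the DTW is the textbook dynamic-programming form with absolute-difference cost, and the helper names are hypothetical.

```python
import math


def rmse(x, y):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def dtw_distance(x, y):
    """Classic O(n*m) dynamic time warping with absolute-difference local cost."""
    n, m = len(x), len(y)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def linear_interpolate(series, start, end):
    """Replace series[start+1:end] by a straight line between the two endpoints."""
    filled = list(series)
    span = end - start
    for k in range(1, span):
        filled[start + k] = series[start] + (series[end] - series[start]) * k / span
    return filled


if __name__ == "__main__":
    # Made-up values: a flat (stationary) window and a post-meal-like peak (transient).
    stationary = [100.0] * 10
    transient = [100.0, 100.0, 120.0, 160.0, 180.0, 160.0, 120.0, 100.0, 100.0, 100.0]
    for name, truth in (("stationary", stationary), ("transient", transient)):
        imputed = linear_interpolate(truth, 1, 7)  # samples 2..6 treated as missing
        print(name, "RMSE:", rmse(truth, imputed), "DTW:", dtw_distance(truth, imputed))
```

Reporting both metrics per regime, rather than one aggregate number, is the design choice the abstract argues for: linear interpolation is exact on the flat window and erases the peak entirely on the transient one.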

AO · Sep 7, 2023

Operator-Based Detecting, Learning, and Stabilizing Unstable Periodic Orbits of Chaotic Attractors

Ali Tavasoli, Heman Shakeri

This paper examines the use of operator-theoretic approaches to the analysis of chaotic systems through the lens of their unstable periodic orbits (UPOs). Our approach involves three data-driven steps for detecting, identifying, and stabilizing UPOs. We demonstrate the use of kernel integral operators within delay coordinates as an innovative method for UPO detection. For identifying the dynamic behavior associated with each individual UPO, we utilize the Koopman operator to present the dynamics as linear equations in the space of Koopman eigenfunctions. This allows for characterizing the chaotic attractor by investigating its principal dynamical modes across varying UPOs. We extend this methodology into an interpretable machine learning framework aimed at stabilizing strange attractors on their UPOs. To illustrate the efficacy of our approach, we apply it to the Lorenz attractor as a case study.

7.3 · LG · May 1

Deep Kernel Learning for Stratifying Glaucoma Trajectories

Bruce Rushing, Angela Danquah, Alireza Namazi et al.

Effectively stratifying patient risk in chronic diseases like glaucoma is a major clinical challenge. Clinicians need tools to identify patients at high risk of progression from sparse and irregularly-sampled electronic health records (EHRs). We propose a novel deep kernel learning (DKL) architecture that leverages a Gaussian Process (GP) backend. The GP's kernel is defined by a transformer-based feature extractor applied to clinical-BERT embeddings to model glaucoma patient trajectories from multimodal EHR data. Our method successfully identifies three clinically distinct patient subgroups. Crucially, the model learns to decouple disease progression from current severity, identifying a high-risk group with a worsening trajectory despite having better average visual acuity than a second, stably poor group. This reveals that the model learns to identify progression risk rather than just the current disease state. This ability to stratify patients based on their risk trajectory progression offers a powerful tool for clinical decision support, enabling targeted interventions for high-risk individuals and improving the management of glaucoma care.

49.2 · LG · May 1

From Prediction to Practice: A Task-Aware Evaluation Framework for Blood Glucose Forecasting

Alireza Namazi, Heman Shakeri

Clinical time-series forecasting is increasingly studied for decision support, yet standard aggregate metrics can obscure whether a model is actually useful for the task it is meant to serve. In safety-critical settings, low average error can coexist with dangerous failures in exactly the high-risk regimes that matter most. We present a task-aware evaluation framework for blood glucose forecasting built around two downstream uses: hypoglycemia early warning and insulin dosing decision support. For early warning, we evaluate on real data from three clinical cohorts using event-level recall and false alarms per patient-day, metrics that reflect operational alarm burden rather than aggregate accuracy. We show that models appearing acceptable overall, with recall above 0.9 on the full test set, can fail badly in the post-bolus slice, where insulin-on-board is elevated and missed warnings carry the greatest clinical consequences. Standard forecasting evaluation, however, does not test whether a model can reason about the effects of actions, a requirement for supporting insulin dosing decisions. We therefore add a second, interventional arm using the FDA-accepted UVA/Padova simulator, where we evaluate whether forecasters can predict glucose responses to altered insulin plans in paired factual/counterfactual scenarios. We show that models that look strong on real-data forecasting often fail to predict the direction, magnitude, or ranking of intervention effects, and choose poor insulin doses when evaluated under a clinically motivated cost. Taken together, the two arms reveal a consistent gap between forecasting accuracy and task-relevant usefulness. We release the benchmark, the standardized preprocessing pipeline for public cohorts, and the simulator-based interventional dataset as a reproducible toolkit.
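The two early-warning metrics named in this abstract, event-level recall and false alarms per patient-day, can be sketched as below. The matching rule is an assumption made for illustration (an event counts as detected if any alarm falls within a fixed lead window before it; alarms matching no event are false), and `event_metrics` is a hypothetical name; the paper's exact definitions may differ.

```python
def event_metrics(event_times, alarm_times, lead_window, patient_days):
    """Return (event-level recall, false alarms per patient-day).

    Times are in minutes. Assumed rule: an alarm at time t matches an event
    at time e when e - lead_window <= t <= e.
    """
    detected = 0
    matched_alarms = set()
    for t_event in event_times:
        hits = [
            i for i, t_alarm in enumerate(alarm_times)
            if t_event - lead_window <= t_alarm <= t_event
        ]
        if hits:
            detected += 1
            matched_alarms.update(hits)
    recall = detected / len(event_times) if event_times else float("nan")
    false_alarms = len(alarm_times) - len(matched_alarms)
    return recall, false_alarms / patient_days


if __name__ == "__main__":
    # Two hypoglycemia events, four alarms, two days of monitoring (made-up data).
    print(event_metrics([100, 500], [80, 300, 490, 495], lead_window=30, patient_days=2))
```

Computing the same pair on a slice of the data, such as the post-bolus windows the abstract mentions, only requires filtering `event_times` and `alarm_times` before the call.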

LG · Aug 6, 2025

Multi-Marginal Stochastic Flow Matching for High-Dimensional Snapshot Data at Irregular Time Points

Justin Lee, Behnaz Moradijamei, Heman Shakeri

Modeling the evolution of high-dimensional systems from limited snapshot observations at irregular time points poses a significant challenge in quantitative biology and related fields. Traditional approaches often rely on dimensionality reduction techniques, which can oversimplify the dynamics and fail to capture critical transient behaviors in non-equilibrium systems. We present Multi-Marginal Stochastic Flow Matching (MMSFM), a novel extension of simulation-free score and flow matching methods to the multi-marginal setting, enabling the alignment of high-dimensional data measured at non-equidistant time points without reducing dimensionality. The use of measure-valued splines enhances robustness to irregular snapshot timing, and score matching prevents overfitting in high-dimensional spaces. We validate our framework on several synthetic and benchmark datasets, including gene expression data collected at uneven time points and an image progression task, demonstrating the method's versatility.

LG · Jun 29, 2025

Online Meal Detection Based on CGM Data Dynamics

Ali Tavasoli, Heman Shakeri

We utilize dynamical modes as features derived from Continuous Glucose Monitoring (CGM) data to detect meal events. By leveraging the inherent properties of underlying dynamics, these modes capture key aspects of glucose variability, enabling the identification of patterns and anomalies associated with meal consumption. This approach not only improves the accuracy of meal detection but also enhances the interpretability of the underlying glucose dynamics. By focusing on dynamical features, our method provides a robust framework for feature extraction, facilitating generalization across diverse datasets and ensuring reliable performance in real-world applications. The proposed technique offers significant advantages over traditional approaches, improving detection accuracy,

CV · Nov 18, 2025

Fusing Biomechanical and Spatio-Temporal Features for Fall Prediction: Characterizing and Mitigating the Simulation-to-Reality Gap

Md Fokhrul Islam, Sajeda Al-Hammouri, Christopher J. Arellano et al.

Falls are a leading cause of injury and loss of independence among older adults. Vision-based fall prediction systems offer a non-invasive solution to anticipate falls seconds before impact, but their development is hindered by the scarcity of available fall data. Contributing to these efforts, this study proposes the Biomechanical Spatio-Temporal Graph Convolutional Network (BioST-GCN), a dual-stream model that combines both pose and biomechanical information using a cross-attention fusion mechanism. Our model outperforms the vanilla ST-GCN baseline by 5.32% and 2.91% F1-score on the simulated MCF-UA stunt-actor and MUVIM datasets, respectively. The spatio-temporal attention mechanisms in the ST-GCN stream also provide interpretability by identifying critical joints and temporal phases. However, a critical simulation-reality gap persists. While our model achieves an 89.0% F1-score with full supervision on simulated data, zero-shot generalization to unseen subjects drops to 35.9%. This performance decline is likely due to biases in simulated data, such as 'intent-to-fall' cues. For older adults, particularly those with diabetes or frailty, this gap is exacerbated by their unique kinematic profiles. To address this, we propose personalization strategies and advocate for privacy-preserving data pipelines to enable real-world validation. Our findings underscore the urgent need to bridge the gap between simulated and real-world data to develop effective fall prediction systems for vulnerable elderly populations.

LG · Nov 25, 2025

The Driver-Blindness Phenomenon: Why Deep Sequence Models Default to Autocorrelation in Blood Glucose Forecasting

Heman Shakeri

Deep sequence models for blood glucose forecasting consistently fail to leverage clinically informative drivers--insulin, meals, and activity--despite well-understood physiological mechanisms. We term this Driver-Blindness and formalize it via $Δ_{\text{drivers}}$, the performance gain of multivariate models over matched univariate baselines. Across the literature, $Δ_{\text{drivers}}$ is typically near zero. We attribute this to three interacting factors: architectural biases favoring autocorrelation (C1), data fidelity gaps that render drivers noisy and confounded (C2), and physiological heterogeneity that undermines population-level models (C3). We synthesize strategies that partially mitigate Driver-Blindness--including physiological feature encoders, causal regularization, and personalization--and recommend that future work routinely report $Δ_{\text{drivers}}$ to prevent driver-blind models from being considered state-of-the-art.
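A minimal sketch of how $Δ_{\text{drivers}}$ might be computed for one model pair follows. The abstract defines it only as the performance gain of a multivariate model over a matched univariate baseline; measuring that gain as an absolute RMSE difference is an assumption made here, and `delta_drivers` is a hypothetical name.

```python
import math


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def delta_drivers(y_true, pred_univariate, pred_multivariate):
    """RMSE(univariate) - RMSE(multivariate); positive means the drivers helped.

    Assumption: both models share architecture, data split, and horizon, so the
    only difference is access to the driver inputs (insulin, meals, activity).
    """
    return rmse(y_true, pred_univariate) - rmse(y_true, pred_multivariate)


if __name__ == "__main__":
    glucose = [100.0, 110.0, 120.0]        # made-up targets (mg/dL)
    cgm_only = [102.0, 112.0, 122.0]       # made-up univariate forecasts
    with_drivers = [101.0, 111.0, 121.0]   # made-up multivariate forecasts
    print(delta_drivers(glucose, cgm_only, with_drivers))
```

A value near zero on held-out data is the signature the abstract calls Driver-Blindness.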

CY · Nov 25, 2025

The Metaphysics We Train: A Heideggerian Reading of Machine Learning

Heman Shakeri

This paper offers a phenomenological reading of contemporary machine learning through Heideggerian concepts, aimed at enriching practitioners' reflexive understanding of their own practice. We argue that this philosophical lens reveals three insights invisible to purely technical analysis. First, the algorithmic Entwurf (projection) is distinctive in being automated, opaque, and emergent--a metaphysics that operates without explicit articulation or debate, crystallizing implicitly through gradient descent rather than theoretical argument. Second, even sophisticated technical advances remain within the regime of Gestell (Enframing), improving calculation without questioning the primacy of calculation itself. Third, AI's lack of existential structure, specifically the absence of Care (Sorge), is genuinely explanatory: it illuminates why AI systems have no internal resources for questioning their own optimization imperatives, and why they optimize without the anxiety (Angst) that signals, in human agents, the friction between calculative absorption and authentic existence. We conclude by exploring the pedagogical value of this perspective, arguing that data science education should cultivate not only technical competence but ontological literacy--the capacity to recognize what worldviews our tools enact and when calculation itself may be the wrong mode of engagement.

SI · Aug 1, 2021

A purely data-driven framework for prediction, optimization, and control of networked processes: application to networked SIS epidemic model

Ali Tavasoli, Teague Henry, Heman Shakeri

Networks are landmarks of many complex phenomena where interweaving interactions between different agents transform simple local rule-sets into nonlinear emergent behaviors. While some recent studies unveil associations between the network structure and the underlying dynamical process, identifying stochastic nonlinear dynamical processes continues to be an outstanding problem. Here we develop a simple data-driven framework based on operator-theoretic techniques to identify and control stochastic nonlinear dynamics taking place over large-scale networks. The proposed approach requires no prior knowledge of the network structure and identifies the underlying dynamics solely using a collection of two-step snapshots of the states. This data-driven system identification is achieved by using the Koopman operator to find a low-dimensional representation of the dynamical patterns that evolve linearly. Further, we use the global linear Koopman model to solve critical control problems by applying it to model predictive control (MPC), typically a challenging proposition for large networks. We show that our proposed approach tackles this by converting the original nonlinear program into a more tractable optimization problem that is convex and has far fewer variables.

SI · Oct 2, 2019

A new method for quantifying network cyclic structure to improve community detection

Behnaz Moradi-Jamei, Heman Shakeri, Pietro Poggi-Corradini et al.

A distinguishing property of communities in networks is that cycles are more prevalent within communities than across communities. Thus, the detection of these communities may be aided through the incorporation of measures of the local "richness" of the cyclic structure. In this paper, we introduce renewal non-backtracking random walks (RNBRW) as a way of quantifying this structure. RNBRW gives a weight to each edge equal to the probability that a non-backtracking random walk completes a cycle with that edge. Hence, edges with larger weights may be thought of as more important to the formation of cycles. Of note, since separate random walks can be performed in parallel, RNBRW weights can be estimated very quickly, even for large graphs. We give simulation results showing that pre-weighting edges through RNBRW may substantially improve the performance of common community detection algorithms. Our results suggest that RNBRW is especially efficient for the challenging case of detecting communities in sparse graphs.
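The edge-weighting idea in this abstract can be sketched as a Monte Carlo estimate. The details below are assumptions made for illustration, not taken from the paper: each walk starts on a uniformly chosen directed edge, never steps straight back to the node it just left, stops when it reaches an already-visited node (crediting the edge that closed the cycle), and is discarded if it hits a dead end. The graph is assumed simple and undirected, and `rnbrw_weights` is a hypothetical name.

```python
import random
from collections import defaultdict


def rnbrw_weights(edges, n_walks, rng):
    """Estimate, per edge, the fraction of completed walks whose cycle it closed."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    directed = [(u, v) for u, v in edges] + [(v, u) for u, v in edges]

    counts = defaultdict(int)
    completed = 0
    for _ in range(n_walks):
        prev, cur = rng.choice(directed)
        visited = {prev, cur}
        while True:
            options = [w for w in adj[cur] if w != prev]  # non-backtracking step
            if not options:
                break  # dead end: this walk credits no edge
            nxt = rng.choice(options)
            if nxt in visited:
                counts[frozenset((cur, nxt))] += 1  # edge that completes the cycle
                completed += 1
                break
            visited.add(nxt)
            prev, cur = cur, nxt
    if completed == 0:
        return {}
    return {tuple(sorted(e)): c / completed for e, c in counts.items()}


if __name__ == "__main__":
    # Triangle 0-1-2 with a pendant edge 2-3; the pendant edge lies on no cycle.
    graph = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(rnbrw_weights(graph, n_walks=2000, rng=random.Random(0)))
```

Because each walk is independent of the others, the loop over `n_walks` can be split across workers and the counts summed, which is the parallelism the abstract points to. The resulting weights would then be passed as edge weights to a community detection algorithm.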