LG Aug 21, 2024
Time Series Foundation Models and Deep Learning Architectures for Earthquake Temporal and Spatial Nowcasting
Alireza Jafari, Geoffrey Fox, John B. Rundle et al.
Advancing the capabilities of earthquake nowcasting, the real-time forecasting of seismic activity, remains a crucial and enduring objective aimed at reducing casualties. This multifaceted challenge has recently gained attention within the deep learning domain, facilitated by the availability of extensive, long-term earthquake datasets. Despite significant advancements, the existing literature on earthquake nowcasting lacks comprehensive evaluations of pre-trained foundation models and modern deep learning architectures. These architectures, such as transformers or graph neural networks, each focus on different aspects of the data, including spatial relationships, temporal patterns, and multi-scale dependencies. This paper addresses this gap by analyzing different architectures and introducing two innovative approaches called MultiFoundationQuake and GNNCoder. We formulate earthquake nowcasting as a time series forecasting problem for the next 14 days within 0.1-degree spatial bins in Southern California, spanning from 1986 to 2024. The earthquake time series is forecast as a function of the logarithm of the energy released by quakes. Our comprehensive evaluation employs several key performance metrics, notably Nash-Sutcliffe Efficiency and Mean Squared Error, over time in each spatial region. The results demonstrate that our introduced models outperform other custom architectures by effectively capturing the temporal-spatial relationships inherent in seismic data. The performance of existing foundation models varies significantly based on the pre-training datasets, emphasizing the need for careful dataset selection. However, we introduce a new general approach termed MultiFoundationPattern that combines a bespoke pattern with foundation model results handled as auxiliary streams. In the earthquake case, the resultant MultiFoundationQuake model achieves the best overall performance.
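The two metrics named in this abstract have standard definitions, sketched below for a single spatial bin. This is a minimal illustration and not the authors' evaluation code; the function names and the sample values are assumptions made for the example.

```python
def mse(observed, predicted):
    """Mean squared error between two equal-length sequences."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((o - p) ** 2 for o, p in zip(observed, predicted)) / len(observed)


def nse(observed, predicted):
    """Nash-Sutcliffe Efficiency: 1 - SSE / (spread of observations about their mean).

    1.0 is a perfect fit; 0.0 means the forecast is no better than always
    predicting the observed mean; negative values are worse than that.
    """
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    spread = sum((o - mean_obs) ** 2 for o in observed)
    if spread == 0:
        raise ValueError("NSE is undefined when the observations are constant")
    return 1.0 - sse / spread


# Hypothetical values for one spatial bin (illustrative only, not from the paper).
observed = [2.0, 4.0, 6.0, 8.0]
predicted = [2.5, 3.5, 6.5, 7.5]
print(mse(observed, predicted), nse(observed, predicted))
```

Because NSE normalizes the squared error by the variability of the observations, it can be compared across bins with very different activity levels, which plain MSE cannot.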
ST Dec 2, 2022
NETpred: Network-based modeling and prediction of multiple connected market indices
Alireza Jafari, Saman Haratizadeh
Market prediction plays a major role in supporting financial decisions. An emerging approach in this domain is to use graphical modeling and analysis to predict the next market index fluctuations. One important question in this domain is how to construct an appropriate graphical model of the data that can be effectively used by a semi-supervised GNN to predict index fluctuations. In this paper, we introduce a framework called NETpred that generates a novel heterogeneous graph representing multiple related indices and their stocks by using several stock-stock and stock-index relation measures. It then carefully selects a diverse set of representative nodes that cover different parts of the state space and whose price movements are accurately predictable. By assigning initial predicted labels to such a set of nodes, NETpred makes sure that the subsequent GCN model can be successfully trained using a semi-supervised learning process. The resulting model is then used to predict the stock labels, which are finally aggregated to infer the labels for all the index nodes in the graph. Our comprehensive set of experiments shows that NETpred improves on the performance of the state-of-the-art baselines by 3%-5% in terms of the F-score measure on different well-known data sets.
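For readers unfamiliar with the GCN model mentioned here, the sketch below shows one graph-convolution step using the widely used symmetric-normalization rule, ReLU(D^-1/2 (A + I) D^-1/2 H W). It is a generic illustration under the assumption of an undirected graph without self-loops; it is not NETpred's actual architecture, and the helper names are hypothetical.

```python
import math


def normalized_adjacency(adj):
    """Return D^-1/2 (A + I) D^-1/2 for a square adjacency matrix without self-loops."""
    n = len(adj)
    # Add self-loops so every node also keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)] for i in range(n)]


def matmul(x, y):
    """Plain matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def gcn_layer(adj, features, weights):
    """One graph-convolution step: ReLU(A_norm @ H @ W)."""
    a_norm = normalized_adjacency(adj)
    z = matmul(matmul(a_norm, features), weights)
    return [[max(0.0, v) for v in row] for row in z]


# Toy graph: two connected nodes, one feature each, identity weight (all hypothetical).
adj = [[0.0, 1.0], [1.0, 0.0]]
features = [[1.0], [3.0]]
weights = [[1.0]]
print(gcn_layer(adj, features, weights))
```

In a semi-supervised setting, layers like this are stacked and the loss is computed only on the nodes that already carry labels, while the propagation step lets label information reach unlabeled neighbours.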
LG Oct 19, 2024
Deep Learning Foundation and Pattern Models: Challenges in Hydrological Time Series
Junyang He, Ying-Jung Chen, Alireza Jafari et al.
There has been active investigation into deep learning approaches for time series analysis, including foundation models. However, most studies do not address significant scientific applications. This paper aims to identify key features in time series by examining hydrology data. Our work advances computer science by emphasizing critical application features and contributes to hydrology and other scientific fields by identifying modeling approaches that effectively capture these features. Scientific time series data are inherently complex, involving observations from multiple locations, each with various time-dependent data streams and exogenous factors that may be static or time-varying and either application-dependent or purely mathematical. This research analyzes hydrology time series from the CAMELS and Caravan global datasets, which encompass rainfall and runoff data across catchments, featuring up to six observed streams and 209 static parameters across approximately 8,000 locations. Our investigation assesses the impact of exogenous data through eight different model configurations for key hydrology tasks. Results demonstrate that integrating exogenous information enhances data representation, reducing mean squared error by up to 40% in the largest dataset. Additionally, we present a detailed performance comparison of over 20 state-of-the-art pattern and foundation models. The analysis is fully open-source, facilitated by Jupyter Notebook on Google Colab for LSTM-based modeling, data preprocessing, and model comparisons. Preliminary findings using alternative deep learning architectures reveal that models incorporating comprehensive observed and exogenous data outperform more limited approaches, including foundation models. Notably, natural annual periodic exogenous time series contribute the most significant improvements, though static and other periodic factors are also valuable.
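The abstract does not say how the annual periodic exogenous series are constructed. One common encoding, shown here purely as an assumption, maps the day of the year onto a sine/cosine pair so that late December and early January sit close together in feature space. The function name and the 365.25-day period are illustrative choices.

```python
import math
from datetime import date


def annual_cycle_features(day, period_days=365.25):
    """Return (sin, cos) of the yearly phase for a calendar date."""
    # tm_yday is 1-based, so January 1 maps to phase 0.
    phase = 2.0 * math.pi * (day.timetuple().tm_yday - 1) / period_days
    return math.sin(phase), math.cos(phase)


# Example: features for a mid-winter and a mid-summer day.
print(annual_cycle_features(date(2021, 1, 1)))
print(annual_cycle_features(date(2021, 7, 2)))
```

Such a pair can be appended to each time step as a known future input alongside the observed streams and static catchment attributes.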
RO Apr 15
Empirical Prediction of Pedestrian Comfort in Mobile Robot Pedestrian Encounters
Alireza Jafari, Hong-Son Nguyen, Yen-Chen Liu
Mobile robots joining public spaces like sidewalks must care for pedestrian comfort. Many studies consider pedestrians' objective safety, for example, by developing collision avoidance algorithms, but not enough studies take the pedestrian's subjective safety or comfort into consideration. Quantifying comfort is a major challenge that hinders mobile robots from understanding and responding to human emotions. We empirically look into the relationship between the mobile robot-pedestrian interaction kinematics and subjective comfort. We perform one-on-one experimental trials, each involving a mobile robot and a volunteer. Statistical analysis of pedestrians' reported comfort versus the kinematic variables shows moderate but significant correlations for most variables. Based on these empirical findings, we design three comfort estimators/predictors derived from the minimum distance, the minimum projected time-to-collision, and a composite estimator. The composite estimator employs all studied kinematic variables and reaches the highest prediction rate and classifying performance among the predictors. The composite predictor has an odds ratio of 3.67. In simple terms, when it identifies a pedestrian as comfortable, it is almost 4 times more likely that the pedestrian is comfortable rather than uncomfortable. The study provides a comfort quantifier for incorporating pedestrian feelings into path planners for more socially compliant robots.
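As a reminder of how an odds ratio is obtained from a binary predictor's confusion table, the sketch below uses the standard definition (TP x TN) / (FP x FN). The counts are hypothetical and are not taken from the study.

```python
def odds_ratio(tp, fp, fn, tn):
    """Odds ratio of a 2x2 table: (TP * TN) / (FP * FN)."""
    if fp == 0 or fn == 0:
        raise ValueError("odds ratio is undefined when FP or FN is zero")
    return (tp * tn) / (fp * fn)


# Hypothetical counts: predicted comfortable vs. reported comfortable.
print(odds_ratio(tp=30, fp=10, fn=10, tn=20))
```

An odds ratio of 1 means the predictor's output is unrelated to the reported comfort; values above 1 indicate a positive association.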
LG Nov 14, 2025
Leveraging Exogenous Signals for Hydrology Time Series Forecasting
Junyang He, Judy Fox, Alireza Jafari et al.
Recent advances in time series research facilitate the development of foundation models. While many state-of-the-art time series foundation models have been introduced, few studies examine their effectiveness in specific downstream applications in physical science. This work investigates the role of integrating domain knowledge into time series models for hydrological rainfall-runoff modeling. Using the CAMELS-US dataset, which includes rainfall and runoff data from 671 locations with six time series streams and 30 static features, we compare baseline and foundation models. Results demonstrate that models incorporating comprehensive known exogenous inputs outperform more limited approaches, including foundation models. Notably, incorporating natural annual periodic time series contributes the most significant improvements.
LG May 30, 2025
DeepBoost-AF: A Novel Unsupervised Feature Learning and Gradient Boosting Fusion for Robust Atrial Fibrillation Detection in Raw ECG Signals
Alireza Jafari, Fereshteh Yousefirizi, Vahid Seydi
Atrial fibrillation (AF) is a prevalent cardiac arrhythmia associated with elevated health risks, where timely detection is pivotal for mitigating stroke-related morbidity. This study introduces an innovative hybrid methodology integrating unsupervised deep learning and gradient boosting models to improve AF detection. A 19-layer deep convolutional autoencoder (DCAE) is coupled with three boosting classifiers, namely AdaBoost, XGBoost, and LightGBM (LGBM), to harness their complementary advantages while addressing individual limitations. The proposed framework uniquely combines DCAE with gradient boosting, enabling end-to-end AF identification without manual feature extraction. The DCAE-LGBM model attains an F1-score of 95.20%, a sensitivity of 99.99%, and an inference latency of four seconds, outperforming existing methods and aligning with clinical deployment requirements. The DCAE integration significantly enhances boosting models, positioning this hybrid system as a reliable tool for automated AF detection in clinical settings.
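The reported F1-score and sensitivity follow from confusion-matrix counts in the usual way. The sketch below shows the standard formulas with hypothetical counts; it does not reproduce the paper's results.

```python
def sensitivity(tp, fn):
    """Recall on the positive (AF) class: TP / (TP + FN)."""
    return tp / (tp + fn)


def precision(tp, fp):
    """Fraction of positive predictions that are correct: TP / (TP + FP)."""
    return tp / (tp + fp)


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and sensitivity, written in count form."""
    return 2 * tp / (2 * tp + fp + fn)


# Hypothetical counts for illustration only.
tp, fp, fn = 8, 2, 2
print(sensitivity(tp, fn), precision(tp, fp), f1_score(tp, fp, fn))
```

A very high sensitivity combined with a lower F1-score, as in the abstract, implies that precision is the weaker of the two components.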
TR Feb 19, 2022
GCNET: graph-based prediction of stock price movement using graph convolutional network
Alireza Jafari, Saman Haratizadeh
The importance of considering related stocks' data for the prediction of stock price movement has been shown in many studies; however, advanced graphical techniques for modeling, embedding, and analyzing the behavior of interrelated stocks have not yet been widely exploited for the prediction of stock price movements. The main challenges in this domain are to find a way of modeling the existing relations among an arbitrary set of stocks and to exploit such a model to improve the prediction performance for those stocks. Most existing methods in this domain rely on basic graph-analysis techniques with limited prediction power, and suffer from a lack of generality and flexibility. In this paper, we introduce a novel framework, called GCNET, that models the relations among an arbitrary set of stocks as a graph structure called an influence network and uses a set of history-based prediction models to infer plausible initial labels for a subset of the stock nodes in the graph. Finally, GCNET uses the Graph Convolutional Network algorithm to analyze this partially labeled graph and predict the next direction of price movement for each stock in the graph. GCNET is a general prediction framework that can be applied to the prediction of the price fluctuations of interacting stocks based on their historical data. Our experiments and evaluations on a set of stocks from the NASDAQ index demonstrate that GCNET significantly improves on the performance of state-of-the-art methods in terms of accuracy and MCC measures.
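The Matthews correlation coefficient (MCC) used in this evaluation can be computed from confusion-matrix counts as sketched below. Returning 0.0 when the denominator vanishes is a common convention assumed here, and the counts are hypothetical.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient for binary up/down predictions."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denom == 0:
        # Convention: report 0.0 when any row or column of the table is empty.
        return 0.0
    return (tp * tn - fp * fn) / denom


# Hypothetical counts for illustration only.
print(mcc(tp=40, tn=35, fp=15, fn=10))
```

Unlike accuracy, MCC stays near zero for a classifier that always predicts the majority direction, which makes it a useful companion metric when up and down days are imbalanced.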