Jimeng Shi

LG
h-index: 37
Papers: 18
Citations: 159
Novelty: 44%
AI Score: 58

18 Papers

LG · Sep 9, 2024 · Code
CoDiCast: Conditional Diffusion Model for Global Weather Prediction with Uncertainty Quantification

Jimeng Shi, Bowen Jin, Jiawei Han et al.

Accurate weather forecasting is critical for science and society. Yet, existing methods have not managed to simultaneously achieve high accuracy, low uncertainty, and high computational efficiency. On one hand, to quantify the uncertainty in weather predictions, the strategy of ensemble forecast (i.e., generating a set of diverse predictions) is often employed. However, traditional ensemble numerical weather prediction (NWP) is computationally intensive. On the other hand, most existing machine learning-based weather prediction (MLWP) approaches are efficient and accurate. Nevertheless, they are deterministic and cannot capture the uncertainty of weather forecasting. In this work, we propose CoDiCast, a conditional diffusion model to generate accurate global weather prediction, while achieving uncertainty quantification with ensemble forecasts and modest computational cost. The key idea is to simulate a conditional version of the reverse denoising process in diffusion models, which starts from pure Gaussian noise to generate realistic weather scenarios for a future time point. Each denoising step is conditioned on observations from the recent past. Ensemble forecasts are achieved by repeatedly sampling from stochastic Gaussian noise to represent uncertainty quantification. CoDiCast is trained on a decade of ERA5 reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF). Experimental results demonstrate that our approach outperforms several existing data-driven methods in accuracy. Our conditional diffusion model, CoDiCast, can generate 6-day global weather forecasts, at 6-hour steps and $5.625^\circ$ latitude-longitude resolution, for over 5 variables, in about 12 minutes on a commodity A100 GPU machine with 80GB memory. The open-sourced code is provided at https://github.com/JimengShi/CoDiCast.
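
The ensemble idea in this abstract (draw several forecasts from different Gaussian noise seeds, then summarize them) can be illustrated with a minimal sketch. The sampler below is a hypothetical stand-in, not the CoDiCast model: it starts from Gaussian noise and drifts toward the most recent observation over a fixed number of steps. The function names, the step count, the noise scale, and the toy temperature values are all assumptions made for illustration.

```python
import random
import statistics


def toy_conditional_sampler(recent_obs, rng, steps=10):
    """Hypothetical stand-in for a conditional reverse-denoising sampler.

    Starts from Gaussian noise and, at every step, moves halfway toward a
    conditioning signal (here simply the latest observation) while adding
    a small amount of fresh noise. This is NOT a trained diffusion model.
    """
    target = recent_obs[-1]
    x = rng.gauss(0.0, 1.0)  # start from pure Gaussian noise
    for _ in range(steps):
        x = x + 0.5 * (target - x) + rng.gauss(0.0, 0.2)
    return x


def ensemble_forecast(recent_obs, members, seed=0):
    """Draw `members` independent samples; return them with mean and spread."""
    rng = random.Random(seed)
    samples = [toy_conditional_sampler(recent_obs, rng) for _ in range(members)]
    return samples, statistics.mean(samples), statistics.stdev(samples)


# Toy conditioning data (assumed values, e.g. temperatures in degrees Celsius).
recent_temperatures = [22.0, 23.5, 25.0]
samples, mean, spread = ensemble_forecast(recent_temperatures, members=20)
print(f"ensemble mean={mean:.2f}, spread={spread:.2f}")
```

The ensemble mean serves as the point forecast and the standard deviation across members as a simple uncertainty estimate; a real system would replace the toy sampler with the trained denoising network.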

LG · Apr 23, 2022
Time Series Forecasting (TSF) Using Various Deep Learning Models

Jimeng Shi, Mahek Jain, Giri Narasimhan

Time Series Forecasting (TSF) is used to predict the target variables at a future time point based on the learning from previous time points. To keep the problem tractable, learning methods use data from a fixed-length window in the past as an explicit input. In this paper, we study how the performance of predictive models changes as a function of different look-back window sizes and different amounts of time to predict into the future. We also consider the performance of the recent attention-based Transformer models, which have had good success in the image processing and natural language processing domains. In all, we compare four different deep learning methods (RNN, LSTM, GRU, and Transformer) along with a baseline method. The dataset we used is the Beijing Air Quality Dataset from the UCI website, which includes a multivariate time series of many factors measured on an hourly basis for a period of 5 years (2010-14). For each model, we also report on the relationship between the performance and the look-back window sizes and the number of predicted time points into the future. Our experiments suggest that Transformer models have the best performance with the lowest Mean Absolute Errors (MAE = 14.599, 23.273) and Root Mean Square Errors (RMSE = 23.573, 38.131) for most of our single-step and multi-step predictions. The best size for the look-back window to predict 1 hour into the future appears to be one day, while 2 or 4 days perform the best to predict 3 hours into the future.
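
To make the look-back window and the two error metrics concrete, here is a minimal sketch in plain Python. The helper names (`make_windows`, `mae`, `rmse`), the toy series, and the persistence baseline (predict the last observed value) are assumptions for illustration; the abstract does not say which baseline the paper uses.

```python
import math


def make_windows(series, lookback, horizon):
    """Slice a series into (input window, target) pairs.

    Each input holds `lookback` consecutive values; the target is the value
    `horizon` steps after the window ends.
    """
    inputs, targets = [], []
    for start in range(len(series) - lookback - horizon + 1):
        end = start + lookback
        inputs.append(series[start:end])
        targets.append(series[end + horizon - 1])
    return inputs, targets


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean square error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy hourly series (assumed values).
series = [10.0, 12.0, 11.0, 15.0, 14.0, 18.0, 17.0, 21.0]
X, y = make_windows(series, lookback=3, horizon=1)

# Persistence baseline: predict the last value of each look-back window.
persistence = [window[-1] for window in X]
print(f"MAE={mae(y, persistence):.3f}, RMSE={rmse(y, persistence):.3f}")
```

Varying `lookback` and `horizon` in `make_windows` reproduces the kind of sweep the abstract describes, with any forecasting model substituted for the persistence baseline.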

LG · Sep 21, 2024 · Code
ReFine: Boosting Time Series Prediction of Extreme Events by Reweighting and Fine-tuning

Jimeng Shi, Azam Shirali, Giri Narasimhan

Extreme events are of great importance since they often represent impactful occurrences. For instance, in terms of climate and weather, extreme events might be major storms, floods, extreme heat or cold waves, and more. However, they are often located at the tail of the data distribution. Consequently, accurately predicting these extreme events is challenging due to their rarity and irregularity. Prior studies have also referred to this as the out-of-distribution (OOD) problem, which occurs when the distribution of the test data is substantially different from that used for training. In this work, we propose two strategies, reweighting and fine-tuning, to tackle the challenge. Reweighting is a strategy used to force machine learning models to focus on extreme events, which is achieved by a weighted loss function that assigns greater penalties to the prediction errors for the extreme samples relative to those on the remainder of the data. Unlike previous intuitive reweighting methods based on simple heuristics of data distribution, we employ meta-learning to dynamically optimize these penalty weights. To further boost the performance on extreme samples, we start from the reweighted models and fine-tune them using only rare extreme samples. Through extensive experiments on multiple data sets, we empirically validate that our meta-learning-based reweighting outperforms existing heuristic ones, and the fine-tuning strategy can further increase the model performance. More importantly, these two strategies are model-agnostic and can be implemented on any type of neural network for time series forecasting. The open-sourced code is available at https://github.com/JimengShi/ReFine.
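
The weighted loss described above can be sketched as follows. This example uses a fixed threshold heuristic to assign weights, which is the kind of simple scheme the abstract contrasts with; the paper's meta-learned weights are not reproduced here. The function names, the threshold, the weight value, and the toy data are assumptions.

```python
def weighted_mse(y_true, y_pred, weights):
    """Weighted mean squared error, normalized by the total weight."""
    total = sum(w * (t - p) ** 2 for t, p, w in zip(y_true, y_pred, weights))
    return total / sum(weights)


def threshold_weights(y_true, threshold, extreme_weight):
    """Heuristic weights: samples at or above `threshold` count as extreme."""
    return [extreme_weight if t >= threshold else 1.0 for t in y_true]


# Toy targets with one extreme sample (9.0) that the model under-predicts.
y_true = [1.0, 2.0, 1.5, 9.0]
y_pred = [1.0, 2.5, 1.5, 6.0]

uniform = weighted_mse(y_true, y_pred, [1.0] * len(y_true))
weights = threshold_weights(y_true, threshold=8.0, extreme_weight=5.0)
reweighted = weighted_mse(y_true, y_pred, weights)
print(f"uniform loss={uniform:.4f}, reweighted loss={reweighted:.4f}")
```

Because the error on the extreme sample now dominates the loss, a model trained against it is pushed harder to fit the tail of the distribution.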

LG · Jun 28, 2023
Deep Learning Models for Flood Predictions in South Florida

Jimeng Shi, Zeda Yin, Rukmangadh Myana et al.

Simulating and predicting the water level/stage in river systems is essential for flood warnings, hydraulic operations, and flood mitigation. Physics-based detailed hydrological and hydraulic computational tools, such as HEC-RAS, MIKE, and SWMM, can be used to simulate a complete watershed and compute the water stage at any point in the river system. However, these physics-based models are computationally intensive, especially for large watersheds and for longer simulations, since they use detailed grid representations of terrain elevation maps of the entire watershed and solve complex partial differential equations (PDEs) for each grid cell. To overcome this problem, we train several deep learning (DL) models for use as surrogate models to rapidly predict the water stage. A portion of the Miami River in South Florida was chosen as a case study for this paper. Extensive experiments show that the performance of various DL models (MLP, RNN, CNN, LSTM, and RCNN) is significantly better than that of the physics-based model, HEC-RAS, even during extreme precipitation conditions (i.e., tropical storms), and with speedups exceeding 500x. To predict the water stages more accurately, our DL models use both measured variables of the river system from the recent past and covariates for which predictions are typically available for the near future.

LG · Oct 11, 2023
Graph Transformer Network for Flood Forecasting with Heterogeneous Covariates

Jimeng Shi, Vitalii Stebliankin, Zhaonan Wang et al.

Floods can be very destructive, causing heavy damage to life, property, and livelihoods. Global climate change and the consequent sea-level rise have increased the occurrence of extreme weather events, resulting in elevated and frequent flood risk. Therefore, accurate and timely flood forecasting in coastal river systems is critical to facilitate good flood management. However, the computational tools currently used are either slow or inaccurate. In this paper, we propose a Flood prediction tool using Graph Transformer Network (FloodGTN) for river systems. More specifically, FloodGTN learns the spatio-temporal dependencies of water levels at different monitoring stations using Graph Neural Networks (GNNs) and an LSTM. It is currently implemented to consider external covariates such as rainfall, tide, and the settings of hydraulic structures (e.g., outflows of dams, gates, pumps, etc.) along the river. We use a Transformer to learn the attention given to external covariates in computing water levels. We apply the FloodGTN tool to data from the South Florida Water Management District, which manages a coastal area prone to frequent storms and hurricanes. Experimental results show that FloodGTN outperforms the physics-based model (HEC-RAS), achieving a 70% improvement in accuracy while speeding up run times by at least 500x.

AI · May 13 · Code
Retrieval is Cheap, Show Me the Code: Executable Multi-Hop Reasoning for Retrieval-Augmented Generation

Jiashuo Sun, Jimeng Shi, Yixuan Xie et al.

Retrieval-Augmented Generation (RAG) has become a standard approach for knowledge-intensive question answering, but existing systems remain brittle on multi-hop questions, where solving the task requires chaining multiple retrieval and reasoning steps. Key challenges are that current methods represent reasoning through free-form natural language, where intermediate states are implicit, retrieval queries can drift from intended entities, and errors are detected by the same model that produces them, making self-reflection an unreliable, ungrounded signal. We observe that multi-hop question answering is a typical form of step-by-step computation, and that this structured process aligns closely with how code-specialized language models are trained to operate. Motivated by this, we introduce PyRAG, a framework that reformulates multi-hop RAG as program synthesis and execution. Instead of free-form reasoning trajectories, PyRAG represents the reasoning process as an executable Python program over retrieval and QA tools, exposing intermediate states as variables, producing deterministic feedback through execution, and yielding an inspectable trace of the entire reasoning process. This formulation further enables compiler-grounded self-repair and execution-driven adaptive retrieval without any additional training. Experiments on five QA benchmarks (PopQA, HotpotQA, 2WikiMultihopQA, MuSiQue, and Bamboogle) show that PyRAG consistently outperforms strong baselines under both training-free and RL-trained settings, with especially large gains on compositional multi-hop datasets. Our code, data and models are publicly available at https://github.com/GasolSun36/PyRAG.

LG · May 15, 2024 · Code
TimeX++: Learning Time-Series Explanations with Information Bottleneck

Zichuan Liu, Tianchun Wang, Jimeng Shi et al.

Explaining deep learning models operating on time series data is crucial in various applications of interest which require interpretable and transparent insights from time series signals. In this work, we investigate this problem from an information theoretic perspective and show that most existing measures of explainability may suffer from trivial solutions and distributional shift issues. To address these issues, we introduce a simple yet practical objective function for time series explainable learning. The design of the objective function builds upon the principle of information bottleneck (IB), and modifies the IB objective function to avoid trivial solutions and distributional shift issues. We further present TimeX++, a novel explanation framework that leverages a parametric network to produce explanation-embedded instances that are both in-distribution and label-preserving. We evaluate TimeX++ on both synthetic and real-world datasets, comparing its performance against leading baselines, and validate its practical efficacy through case studies in a real-world environmental application. Quantitative and qualitative evaluations show that TimeX++ outperforms baselines across all datasets, demonstrating a substantial improvement in explanation quality for time series data. The source code is available at https://github.com/zichuan-liu/TimeXplusplus.

LG · Oct 29, 2023
The Power of Explainability in Forecast-Informed Deep Learning Models for Flood Mitigation

Jimeng Shi, Vitalii Stebliankin, Giri Narasimhan

Floods can cause horrific harm to life and property. However, they can be mitigated or even avoided by the effective use of hydraulic structures such as dams, gates, and pumps. By pre-releasing water via these structures in advance of extreme weather events, water levels are sufficiently lowered to prevent floods. In this work, we propose FIDLAR, a Forecast Informed Deep Learning Architecture, achieving flood management in watersheds with hydraulic structures in an optimal manner by balancing out flood mitigation and unnecessary wastage of water via pre-releases. We perform experiments with FIDLAR using data from the South Florida Water Management District, which manages a coastal area that is highly prone to frequent storms and floods. Results show that FIDLAR performs better than the current state-of-the-art with several orders of magnitude speedup and with provably better pre-release schedules. The dramatic speedups make it possible for FIDLAR to be used for real-time flood management. The main contribution of this paper is the effective use of tools for model explainability, allowing us to understand the contribution of the various environmental factors towards its decisions.

LG · May 19
Accurate, Efficient, and Explainable Deep Learning Approaches for Environmental Science Problems

Jimeng Shi

Environmental science plays a pivotal role in safeguarding ecosystems, a domain driven by large-scale, heterogeneous data. In the big data era, artificial intelligence (AI) has emerged as a transformative tool for learning patterns and supporting decision-making. This dissertation develops AI-based approaches tailored to complex environmental science problems to achieve Environmental Intelligence, studying three specific challenges. First, we focus on flood prediction and management in coastal river systems. Conventional physics-based models are computationally intensive, limiting real-time application. To overcome this, we propose a deep learning (DL)-based model, WaLeF, for water level forecasting, and a forecast-informed DL model, FIDLAr, to manage water levels. Evaluated in a flood-prone coastal system in South Florida characterized by extreme rainfall and sea level fluctuations, FIDLAr outperforms baselines in accuracy and efficiency while providing interpretable outputs. Second, we target global weather prediction, which is challenged by massive data scale. Traditional physics-based methods are deterministic and computationally heavy. We propose CoDiCast, a conditional diffusion model tailored for probabilistic weather forecasting. Experiments show that CoDiCast, which adapts generative AI to predictive tasks, achieves accurate, efficient forecasts with explicit uncertainty quantification. Lastly, we address scientific question-answering in environmental science. When answering in-domain questions, large language models (LLMs) often suffer from hallucinations due to out-of-date or limited knowledge. While retrieval-augmented generation (RAG) retrieves domain-specific knowledge, existing methods trade off accuracy, efficiency, or explainability. We propose Hypercube-RAG, built on a structured text cube framework, which successfully exhibits all three properties simultaneously.

LG · Jan 12, 2025 · Code
Deep Learning and Foundation Models for Weather Prediction: A Survey

Jimeng Shi, Azam Shirali, Bowen Jin et al.

Physics-based numerical models have been the bedrock of atmospheric sciences for decades, offering robust solutions but often at the cost of significant computational resources. Deep learning (DL) models have emerged as powerful tools in meteorology, capable of analyzing complex weather and climate data by learning intricate dependencies and providing rapid predictions once trained. While these models demonstrate promising performance in weather prediction, often surpassing traditional physics-based methods, they still face critical challenges. This paper presents a comprehensive survey of recent deep learning and foundation models for weather prediction. We propose a taxonomy to classify existing models based on their training paradigms: deterministic predictive learning, probabilistic generative learning, and pre-training and fine-tuning. For each paradigm, we delve into the underlying model architectures, address major challenges, offer key insights, and propose targeted directions for future research. Furthermore, we explore real-world applications of these methods and provide a curated summary of open-source code repositories and widely used datasets, aiming to bridge research advancements with practical implementations while fostering open and trustworthy scientific practices in adopting cutting-edge artificial intelligence for weather prediction. The related sources are available at https://github.com/JimengShi/DL-Foundation-Models-Weather.

CL · Mar 10
TaSR-RAG: Taxonomy-guided Structured Reasoning for Retrieval-Augmented Generation

Jiashuo Sun, Yixuan Xie, Jimeng Shi et al.

Retrieval-Augmented Generation (RAG) helps large language models (LLMs) answer knowledge-intensive and time-sensitive questions by conditioning generation on external evidence. However, most RAG systems still retrieve unstructured chunks and rely on one-shot generation, which often yields redundant context, low information density, and brittle multi-hop reasoning. While structured RAG pipelines can improve grounding, they typically require costly and error-prone graph construction or impose rigid entity-centric structures that do not align with the query's reasoning chain. We propose TaSR-RAG, a taxonomy-guided structured reasoning framework for evidence selection. We represent both queries and documents as relational triples, and constrain entity semantics with a lightweight two-level taxonomy to balance generalization and precision. Given a complex question, TaSR-RAG decomposes it into an ordered sequence of triple sub-queries with explicit latent variables, then performs step-wise evidence selection via hybrid triple matching that combines semantic similarity over raw triples with structural consistency over typed triples. By maintaining an explicit entity binding table across steps, TaSR-RAG resolves intermediate variables and reduces entity conflation without explicit graph construction or exhaustive search. Experiments on multiple multi-hop question answering benchmarks show that TaSR-RAG consistently outperforms strong RAG and structured-RAG baselines by up to 14%, while producing clearer evidence attribution and more faithful reasoning traces.

LG · May 2, 2025 · Code
How Effective are Large Time Series Models in Hydrology? A Study on Water Level Forecasting in Everglades

Rahuul Rangaraj, Jimeng Shi, Azam Shirali et al.

The Everglades play a crucial role in flood and drought regulation, water resource planning, and ecosystem management in the surrounding regions. However, traditional physics-based and statistical methods for predicting water levels often face significant challenges, including high computational costs and limited adaptability to diverse or unforeseen conditions. Recent advancements in large time series models have demonstrated the potential to address these limitations, with state-of-the-art deep learning and foundation models achieving remarkable success in time series forecasting across various domains. Despite this progress, their application to critical environmental systems, such as the Everglades, remains underexplored. In this study, we fill the gap by investigating twelve task-specific models and five time series foundation models across six categories for a real-world application focused on water level prediction in the Everglades. Our primary results show that the foundation model Chronos significantly outperforms all other models, while the remaining foundation models exhibit relatively poor performance. We also observe that the performance of task-specific models varies with the model architecture, and we discuss possible reasons. We hope our study and findings will inspire the community to explore the applicability of large time series models in hydrological applications. The code and data are available at https://github.com/rahuul2992000/Everglades-Benchmark.

LG · Aug 6, 2025 · Code
Retrieval-Augmented Water Level Forecasting for Everglades

Rahuul Rangaraj, Jimeng Shi, Rajendra Paudel et al.

Accurate water level forecasting is crucial for managing ecosystems such as the Everglades, a subtropical wetland vital for flood mitigation, drought management, water resource planning, and biodiversity conservation. While recent advances in deep learning, particularly time series foundation models, have demonstrated success in general-domain forecasting, their application in hydrology remains underexplored. Furthermore, they often struggle to generalize across diverse unseen datasets and domains, due to the lack of effective mechanisms for adaptation. To address this gap, we introduce Retrieval-Augmented Forecasting (RAF) into the hydrology domain, proposing a framework that retrieves historically analogous multivariate hydrological episodes to enrich the model input before forecasting. By maintaining an external archive of past observations, RAF identifies and incorporates relevant patterns from historical data, thereby enhancing contextual awareness and predictive accuracy without requiring task-specific retraining or fine-tuning of the model. Furthermore, we explore and compare both similarity-based and mutual information-based RAF methods. We conduct a comprehensive evaluation on real-world data from the Everglades, demonstrating that the RAF framework yields substantial improvements in water level forecasting accuracy. This study highlights the potential of RAF approaches in environmental hydrology and paves the way for broader adoption of adaptive AI methods by domain experts in ecosystem management. The code and data are available at https://github.com/rahuul2992000/WaterRAF.
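
A minimal sketch of the similarity-based variant of this idea follows: search an archive of past observations for the windows whose shape best matches the recent context, then prepend those episodes (with what happened next) to the model input. It is univariate for brevity, whereas the abstract describes multivariate episodes. The mean-centered cosine similarity, the function names, and the toy water levels are assumptions, not the authors' implementation.

```python
import math


def centered_cosine(a, b):
    """Cosine similarity after subtracting each window's mean (compares shape)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    ca = [x - mean_a for x in a]
    cb = [x - mean_b for x in b]
    norm_a = math.sqrt(sum(x * x for x in ca))
    norm_b = math.sqrt(sum(x * x for x in cb))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # a flat window has no shape to compare
    return sum(x * y for x, y in zip(ca, cb)) / (norm_a * norm_b)


def retrieve(query, archive, horizon, k):
    """Start indices of the k archive windows most similar to `query`.

    Only windows followed by at least `horizon` further values are considered,
    so each retrieved episode also shows what happened next.
    """
    window = len(query)
    scored = []
    for start in range(len(archive) - window - horizon + 1):
        score = centered_cosine(query, archive[start:start + window])
        scored.append((score, start))
    scored.sort(key=lambda item: (-item[0], item[1]))  # best first, earliest on ties
    return [start for _, start in scored[:k]]


def augment(query, archive, horizon, k):
    """Prepend the retrieved episodes (window plus continuation) to the query."""
    window = len(query)
    context = []
    for start in retrieve(query, archive, horizon, k):
        context.extend(archive[start:start + window + horizon])
    return context + list(query)


# Toy archive of past water levels and a rising recent context (assumed values).
archive = [1.0, 2.0, 3.0, 2.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0]
query = [2.0, 3.0, 4.0]
starts = retrieve(query, archive, horizon=1, k=2)
model_input = augment(query, archive, horizon=1, k=2)
print(starts, model_input)
```

The forecasting model itself is untouched; only its input is enriched, which is why this style of adaptation needs no retraining.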

LG · May 25, 2025 · Code
Hypercube-Based Retrieval-Augmented Generation for Scientific Question-Answering

Jimeng Shi, Sizhe Zhou, Bowen Jin et al.

Large language models (LLMs) often need to incorporate external knowledge to solve theme-specific problems. Retrieval-augmented generation (RAG) has shown high promise, empowering LLMs to generate higher-quality responses with retrieved external data and knowledge. However, most RAG methods retrieve relevant documents based on either sparse or dense retrieval methods or their combinations, which overlooks the essential, multi-dimensional, and structured semantic information present in documents. This structured information plays a critical role in finding concise yet highly relevant information for domain knowledge-intensive tasks, such as scientific question-answering (QA). In this work, we introduce a multi-dimensional (cube) structure, Hypercube, which can index and allocate documents in a pre-defined multi-dimensional space. Built on the hypercube, we further propose Hypercube-RAG, a novel RAG framework for precise and efficient retrieval. Given a query, Hypercube-RAG first decomposes it based on its entities, phrases, and topics along with pre-defined hypercube dimensions, and then retrieves relevant documents from cubes by aligning these decomposed components with corresponding dimensions. Experiments on three datasets across different domains demonstrate that our method improves response accuracy by 3.7% and retrieval accuracy by 5.3% over the strongest RAG baseline. It also improves retrieval speed by one to two orders of magnitude over graph-based RAG. Notably, our Hypercube-RAG inherently offers explainability by revealing those underlying dimensions used for retrieval. The code and data are available at https://github.com/JimengShi/Hypercube-RAG.
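
The indexing idea can be sketched with a small in-memory structure: documents are filed under labels along pre-defined dimensions, and a query that has already been decomposed into per-dimension labels retrieves the documents matching every dimension it mentions. The class name, the dimension names, the sample documents, and the intersection rule are assumptions for illustration; the query decomposition step is not shown, and this is not the authors' implementation.

```python
from collections import defaultdict


class Hypercube:
    """Toy multi-dimensional index: dimension -> label -> set of document ids."""

    def __init__(self, dimensions):
        self.dimensions = tuple(dimensions)
        self.index = {dim: defaultdict(set) for dim in self.dimensions}

    def add(self, doc_id, labels):
        """File a document under its labels along each dimension."""
        for dim, values in labels.items():
            for value in values:
                self.index[dim][value.lower()].add(doc_id)

    def retrieve(self, query_labels):
        """Return (sorted doc ids matching every used dimension, dimensions used)."""
        result = None
        used = []
        for dim, values in query_labels.items():
            if not values:
                continue  # the query says nothing about this dimension
            hits = set()
            for value in values:
                hits |= self.index[dim].get(value.lower(), set())
            result = hits if result is None else result & hits
            used.append(dim)
        return sorted(result or []), used


# Assumed dimensions and documents.
cube = Hypercube(["location", "theme"])
cube.add("d1", {"location": ["Florida"], "theme": ["flood"]})
cube.add("d2", {"location": ["Florida"], "theme": ["hurricane"]})
cube.add("d3", {"location": ["Texas"], "theme": ["flood"]})

# A query already decomposed into per-dimension labels.
docs, used = cube.retrieve({"location": ["florida"], "theme": ["flood"]})
print(docs, used)
```

Returning the dimensions that were used alongside the results mirrors the explainability point in the abstract: the retrieval path is visible as a list of matched dimensions.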

LG · Feb 20, 2024
FIDLAR: Forecast-Informed Deep Learning Architecture for Flood Mitigation

Jimeng Shi, Zeda Yin, Arturo Leon et al.

In coastal river systems, frequent floods, often occurring during major storms or king tides, pose a severe threat to lives and property. However, these floods can be mitigated or even prevented by strategically releasing water before extreme weather events with hydraulic structures such as dams, gates, pumps, and reservoirs. A standard approach used by local water management agencies is the "rule-based" method, which specifies predetermined pre-releases of water based on historical and time-tested human experience, but which tends to result in excess or inadequate water release. The model predictive control (MPC), a physics-based model for prediction, is an alternative approach, albeit involving computationally intensive calculations. In this paper, we propose a Forecast Informed Deep Learning Architecture, FIDLAR, to achieve rapid and optimal flood management with precise water pre-releases. FIDLAR seamlessly integrates two neural network modules: one called the Flood Manager, which is responsible for generating water pre-release schedules, and another called the Flood Evaluator, which assesses these generated schedules. The Evaluator module is pre-trained separately, and its gradient-based feedback is used to train the Manager model, ensuring optimal water pre-releases. We have conducted experiments using FIDLAR with data from a flood-prone coastal area in South Florida, particularly susceptible to frequent storms. Results show that FIDLAR is several orders of magnitude faster than currently used physics-based approaches while outperforming baseline methods with improved water pre-release schedules.

LG · Jun 4, 2025
SF$^2$Bench: Evaluating Data-Driven Models for Compound Flood Forecasting in South Florida

Xu Zheng, Chaohao Lin, Sipeng Chen et al.

Forecasting compound floods presents a significant challenge due to the intricate interplay of meteorological, hydrological, and oceanographic factors. Analyzing compound floods has become more critical as the global climate increases flood risks. Traditional physics-based methods, such as the Hydrologic Engineering Center's River Analysis System, are often time-inefficient. Machine learning has recently demonstrated promise in both modeling accuracy and computational efficiency. However, the scarcity of comprehensive datasets currently hinders systematic analysis. Existing water-related datasets are often limited by a sparse network of monitoring stations and incomplete coverage of relevant factors. To address this challenge, we introduce SF2Bench, a comprehensive time series collection on compound floods in South Florida, which integrates four key factors: tide, rainfall, groundwater, and human management activities (gate and pump controlling). This integration allows for a more detailed analysis of the individual contributions of these drivers to compound flooding and informs the development of improved flood forecasting approaches. To comprehensively evaluate the potential of various modeling paradigms, we assess the performance of six categories of methods, encompassing Multilayer Perceptrons, Convolutional Neural Networks, Recurrent Neural Networks, Graph Neural Networks, Transformers, and Large Language Models. We verified the impact of different key features on flood forecasting through experiments. Our analysis examines temporal and spatial aspects, providing insights into the influence of historical data and spatial dependencies. The varying performance across these approaches underscores the diverse capabilities of each in capturing complex temporal and spatial dependencies inherent in compound floods.

LG · May 8, 2023
Explainable Parallel RCNN with Novel Feature Representation for Time Series Forecasting

Jimeng Shi, Rukmangadh Myana, Vitalii Stebliankin et al.

Accurate time series forecasting is a fundamental challenge in data science. It is often affected by external covariates such as weather or human intervention, which, in many applications, may be predicted with reasonable accuracy. We refer to them as predicted future covariates. However, existing methods that attempt to predict time series in an iterative manner with autoregressive models end up with exponential error accumulations. Other strategies that consider the past and future in the encoder and decoder, respectively, limit themselves by dealing with the historical and future data separately. To address these limitations, a novel feature representation strategy -- shifting -- is proposed to fuse the past data and future covariates such that their interactions can be considered. To extract complex dynamics in time series, we develop a parallel deep learning framework composed of RNN and CNN, both of which are used hierarchically. We also utilize the skip connection technique to improve the model's performance. Extensive experiments on three datasets reveal the effectiveness of our method. Finally, we demonstrate the model interpretability using the Grad-CAM algorithm.

SY · May 1, 2021
Computational Simulation and Analysis of Major Control Parameters of Time-Dependent PV/T Collectors

Jimeng Shi, Cheng-Xian Lin

To improve the performance of photovoltaic/thermal (PV/T) collectors, this paper first validates a previous computational thermal model and then introduces an improved computational thermal model to investigate the effects of the major control parameters on the thermal performance of PV/T collectors, including solar cell temperature, back surface temperature, and outlet water temperature. In addition, a computational electrical model of the PV/T system is introduced to describe the relationship among the voltage, current, and power of a PV module (MSX60 polycrystalline solar cell) used in an experiment in the literature. Simulation results agree very well with the experimental data. The effects of time-steps ranging from 1 hour down to a minute, which is close to real time, are also reported. Finally, several suggestions for improving the efficiency of the PV/T system are presented.