LGJan 25, 2023
Accelerating Domain-aware Deep Learning Models with Distributed TrainingAishwarya Sarkar, Chaoqun Lu, Ali Jannesari
Recent advances in data-generating techniques led to an explosive growth of geo-spatiotemporal data. In domains such as hydrology, ecology, and transportation, interpreting the complex underlying patterns of spatiotemporal interactions with the help of deep learning techniques hence becomes the need of the hour. However, applying deep learning techniques without domain-specific knowledge tends to provide sub-optimal prediction performance. Secondly, training such models on large-scale data requires extensive computational resources. To eliminate these challenges, we present a novel distributed domain-aware spatiotemporal network that utilizes domain-specific knowledge with improved model performance. Our network consists of a pixel-contribution block, a distributed multiheaded multichannel convolutional (CNN) spatial block, and a recurrent temporal block. We choose flood prediction in hydrology as a use case to test our proposed method. From our analysis, the network effectively predicts high peaks in discharge measurements at watershed outlets with up to 4.1x speedup and increased prediction performance of up to 93\%. Our approach achieved a 12.6x overall speedup and increased the mean prediction performance by 16\%. We perform extensive experiments on a dataset of 23 watersheds in a northern state of the U.S. and present our findings.
LGSep 2, 2025Code
HydroGAT: Distributed Heterogeneous Graph Attention Transformer for Spatiotemporal Flood PredictionAishwarya Sarkar, Autrin Hakimi, Xiaoqiong Chen et al.
Accurate flood forecasting remains a challenge for water-resource management, as it demands modeling of local, time-varying runoff drivers (e.g., rainfall-induced peaks, baseflow trends) and complex spatial interactions across a river network. Traditional data-driven approaches, such as convolutional networks and sequence-based models, ignore topological information about the region. Graph Neural Networks (GNNs) propagate information exactly along the river network, which is ideal for learning hydrological routing. However, state-of-the-art GNN-based flood prediction models collapse pixels to coarse catchment polygons as the cost of training explodes with graph size and higher resolution. Furthermore, most existing methods treat spatial and temporal dependencies separately, either applying GNNs solely on spatial graphs or transformers purely on temporal sequences, thus failing to simultaneously capture spatiotemporal interactions critical for accurate flood prediction. We introduce a heterogenous basin graph where every land and river pixel is a node connected by physical hydrological flow directions and inter-catchment relationships. We propose HydroGAT, a spatiotemporal network that adaptively learns local temporal importance and the most influential upstream locations. Evaluated in two Midwestern US basins and across five baseline architectures, our model achieves higher NSE (up to 0.97), improved KGE (up to 0.96), and low bias (PBIAS within $\pm$5%) in hourly discharge prediction, while offering interpretable attention maps that reveal sparse, structured intercatchment influences. To support high-resolution basin-scale training, we develop a distributed data-parallel pipeline that scales efficiently up to 64 NVIDIA A100 GPUs on NERSC Perlmutter supercomputer, demonstrating up to 15x speedup across machines. Our code is available at https://github.com/swapp-lab/HydroGAT.
LGOct 2, 2021
Transfer Learning Approaches for Knowledge Discovery in Grid-based Geo-Spatiotemporal DataAishwarya Sarkar, Jien Zhang, Chaoqun Lu et al.
Extracting and meticulously analyzing geo-spatiotemporal features is crucial to recognize intricate underlying causes of natural events, such as floods. Limited evidence about hidden factors leading to climate change makes it challenging to predict regional water discharge accurately. In addition, the explosive growth in complex geo-spatiotemporal environment data that requires repeated learning by the state-of-the-art neural networks for every new region emphasizes the need for new computationally efficient methods, advanced computational resources, and extensive training on a massive amount of available monitored data. We, therefore, propose HydroDeep, an effectively reusable pretrained model to address this problem of transferring knowledge from one region to another by effectively capturing their intrinsic geo-spatiotemporal variance. Further, we present four transfer learning approaches on HydroDeep for spatiotemporal interpretability that improve Nash-Sutcliffe efficiency by 9% to 108% in new regions with a 95% reduction in time.
LGOct 9, 2020
HydroDeep -- A Knowledge Guided Deep Neural Network for Geo-Spatiotemporal Data AnalysisAishwarya Sarkar, Jien Zhang, Chaoqun Lu et al.
Due to limited evidence and complex causes of regional climate change, the confidence in predicting fluvial floods remains low. Understanding the fundamental mechanisms intrinsic to geo-spatiotemporal information is crucial to improve the prediction accuracy. This paper demonstrates a hybrid neural network architecture - HydroDeep, that couples a process-based hydro-ecological model with a combination of Deep Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) Network. HydroDeep outperforms the independent CNN's and LSTM's performance by 1.6% and 10.5% respectively in Nash-Sutcliffe efficiency. Also, we show that HydroDeep pre-trained in one region is adept at passing on its knowledge to distant places via unique transfer learning approaches that minimize HydroDeep's training duration for a new region by learning its regional geo-spatiotemporal features in a reduced number of iterations.