LGMar 28, 2022

Differentiable, learnable, regionalized process-based models with physical outputs can approach state-of-the-art hydrologic prediction accuracy

arXiv:2203.14827v2213 citationsh-index: 40
Originality Incremental advance
AI Analysis

This work addresses the need for interpretable and physically consistent hydrologic predictions for water resource management and downstream applications, representing a significant but incremental advance over existing methods.

The authors tackled the challenge of predicting hydrologic variables by developing differentiable, learnable process-based models that approach the performance of state-of-the-art deep learning models like LSTM for streamflow prediction, achieving median Nash Sutcliffe efficiencies of 0.732 vs. 0.748 on one dataset and 0.715 vs. 0.722 on another, while also outputting untrained physical variables such as soil storage and evapotranspiration.

Predictions of hydrologic variables across the entire water cycle have significant value for water resource management as well as downstream applications such as ecosystem and water quality modeling. Recently, purely data-driven deep learning models like long short-term memory (LSTM) showed seemingly-insurmountable performance in modeling rainfall-runoff and other geoscientific variables, yet they cannot predict untrained physical variables and remain challenging to interpret. Here we show that differentiable, learnable, process-based models (called δ models here) can approach the performance level of LSTM for the intensively-observed variable (streamflow) with regionalized parameterization. We use a simple hydrologic model HBV as the backbone and use embedded neural networks, which can only be trained in a differentiable programming framework, to parameterize, enhance, or replace the process-based model modules. Without using an ensemble or post-processor, δ models can obtain a median Nash Sutcliffe efficiency of 0.732 for 671 basins across the USA for the Daymet forcing dataset, compared to 0.748 from a state-of-the-art LSTM model with the same setup. For another forcing dataset, the difference is even smaller: 0.715 vs. 0.722. Meanwhile, the resulting learnable process-based models can output a full set of untrained variables, e.g., soil and groundwater storage, snowpack, evapotranspiration, and baseflow, and later be constrained by their observations. Both simulated evapotranspiration and fraction of discharge from baseflow agreed decently with alternative estimates. The general framework can work with models with various process complexity and opens up the path for learning physics from big data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes