LG | AI | Mar 21, 2025

DiTEC-WDN: A Large-Scale Dataset of Hydraulic Scenarios across Multiple Water Distribution Networks

arXiv:2503.17167v2 | 5 citations | h-index: 4 | Sci Data
Originality: Synthesis-oriented
AI Analysis

DiTEC-WDN provides a synthetic benchmark that lets water-infrastructure researchers conduct open scientific research without privacy risks. The contribution is incremental, since it addresses a known data bottleneck rather than introducing a new problem.

The authors address the limited access to real-world Water Distribution Network (WDN) models caused by privacy restrictions. They created DiTEC-WDN, a large-scale dataset of 36,000 unique hydraulic scenarios comprising 228 million graph-based states, to enable data-driven machine learning applications in the water sector.

Privacy restrictions hinder the sharing of real-world Water Distribution Network (WDN) models, limiting the application of emerging data-driven machine learning, which typically requires extensive observations. To address this challenge, we propose the dataset DiTEC-WDN that comprises 36,000 unique scenarios simulated over either short-term (24 hours) or long-term (1 year) periods. We constructed this dataset using an automated pipeline that optimizes crucial parameters (e.g., pressure, flow rate, and demand patterns), facilitates large-scale simulations, and records discrete, synthetic but hydraulically realistic states under standard conditions via rule validation and post-hoc analysis. With a total of 228 million generated graph-based states, DiTEC-WDN can support a variety of machine-learning tasks, including graph-level, node-level, and link-level regression, as well as time-series forecasting. This contribution, released under a public license, encourages open scientific research in the critical water sector, eliminates the risk of exposing sensitive data, and fulfills the need for a large-scale water distribution network benchmark for study comparisons and scenario analysis.
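To make the task list in the abstract concrete, here is a minimal Python sketch of what a single graph-based state might look like and how the four task types could draw targets from it. The `HydraulicState` class, its field names, and the helper functions are all hypothetical and are not taken from the DiTEC-WDN code or file format; the numbers are made up for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HydraulicState:
    """One hypothetical snapshot of a network at a single time step."""
    edges: List[Tuple[int, int]]   # (from_node, to_node) for each pipe
    pressure: List[float]          # one value per node
    demand: List[float]            # one value per node
    flow_rate: List[float]         # one value per link, aligned with `edges`


def node_level_target(state: HydraulicState) -> List[float]:
    # Node-level regression: one value per node (here, pressure).
    return state.pressure


def link_level_target(state: HydraulicState) -> List[float]:
    # Link-level regression: one value per pipe (here, flow rate).
    return state.flow_rate


def graph_level_target(state: HydraulicState) -> float:
    # Graph-level regression: one value per snapshot
    # (total demand, chosen only for illustration).
    return sum(state.demand)


def forecasting_windows(states, history, horizon):
    """Split a time-ordered list of states into (past, future) pairs."""
    pairs = []
    for start in range(len(states) - history - horizon + 1):
        past = states[start:start + history]
        future = states[start + history:start + history + horizon]
        pairs.append((past, future))
    return pairs


# Toy 3-node, 2-pipe network over 4 time steps (made-up numbers).
edges = [(0, 1), (1, 2)]
scenario = [
    HydraulicState(edges, [50.0, 48.0 - t, 45.0 - t], [0.0, 1.0, 2.0], [3.0, 2.0])
    for t in range(4)
]
windows = forecasting_windows(scenario, history=2, horizon=1)
```

In this sketch a scenario is simply a time-ordered list of states, so the same records can serve per-snapshot regression and, through the sliding windows, time-series forecasting.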

Code Implementations: 1 repo