LG · Nov 19, 2024

PyAWD: A Library for Generating Large Synthetic Datasets of Acoustic Wave Propagation

arXiv:2411.12636v2 · h-index: 1
Originality: Synthesis-oriented
AI Analysis

This tool addresses data limitations in earthquake analysis for seismologists and ML researchers, though the contribution is incremental because it builds on existing simulation methods.

The authors tackled the scarcity of seismic data for machine learning by developing PyAWD, a Python library that generates large synthetic datasets of acoustic wave propagation, enabling tasks like epicenter retrieval with fine control over parameters.

Seismic data is often sparse and unevenly distributed due to the high costs and logistical challenges associated with deploying physical seismometers, limiting the application of Machine Learning (ML) in earthquake analysis. While simulation methods exist, no tool allows the generation of large datasets containing simulated measurements of the ground motion. To address this gap, we introduce PyAWD, a Python library designed to generate high-resolution synthetic datasets simulating spatio-temporal acoustic wave propagation in both two-dimensional and three-dimensional heterogeneous media. By allowing fine control over parameters such as the wave speed, external forces, spatial and temporal discretization, and media composition, PyAWD enables the creation of ML-scale datasets that capture the complexity of seismic wave behavior. We illustrate the library's potential with an epicenter retrieval task, showcasing its suitability for designing complex, accurate seismic problems that require advanced ML approaches in the absence or lack of dense real-world data. We also show the usefulness of our tool to tackle the problem of data budgeting in the framework of epicenter retrieval.
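To make the idea in the abstract concrete, here is a minimal sketch of how a synthetic epicenter-retrieval dataset could be produced with a plain NumPy finite-difference solver. It is not PyAWD's API or the authors' implementation: the function names `simulate_wave` and `make_dataset`, the wave equation form u_tt = c^2 ∇²u, the fixed zero-displacement boundaries, the two-layer medium, and the sensor layout are all assumptions chosen for illustration.

```python
import numpy as np


def simulate_wave(c, source, nt, dx=1.0, dt=0.1, sensors=()):
    """Hypothetical helper: explicit leapfrog scheme for u_tt = c^2 * laplacian(u).

    c       : 2D array of wave speeds (heterogeneous medium)
    source  : (row, col) of the initial displacement impulse (the "epicenter")
    sensors : iterable of (row, col) positions where the field is recorded
    Returns an array of shape (len(sensors), nt) with the recorded traces.
    """
    # CFL stability condition for the 2D explicit scheme.
    if c.max() * dt / dx > 1.0 / np.sqrt(2.0):
        raise ValueError("CFL condition violated: reduce dt or increase dx")

    u = np.zeros_like(c, dtype=float)
    u[source] = 1.0          # unit impulse at the epicenter
    u_prev = u.copy()        # zero initial velocity
    coef = (c * dt / dx) ** 2
    traces = np.zeros((len(sensors), nt))

    for t in range(nt):
        # Five-point Laplacian on interior nodes (grid spacing folded into coef).
        lap = np.zeros_like(u)
        lap[1:-1, 1:-1] = (
            u[2:, 1:-1] + u[:-2, 1:-1] + u[1:-1, 2:] + u[1:-1, :-2]
            - 4.0 * u[1:-1, 1:-1]
        )
        u_next = 2.0 * u - u_prev + coef * lap
        # Assumed boundary condition: displacement fixed at zero on the edges.
        u_next[0, :] = u_next[-1, :] = 0.0
        u_next[:, 0] = u_next[:, -1] = 0.0
        u_prev, u = u, u_next
        for i, s in enumerate(sensors):
            traces[i, t] = u[s]
    return traces


def make_dataset(n_samples, grid=32, nt=60, seed=0):
    """Hypothetical helper: sensor traces (X) paired with epicenter positions (y)."""
    rng = np.random.default_rng(seed)
    # Assumed medium: two layers with different wave speeds.
    c = np.full((grid, grid), 1.0)
    c[:, grid // 2:] = 1.5
    # Assumed sensor layout: one virtual seismometer near each corner.
    m = grid // 8
    sensors = [(m, m), (m, grid - 1 - m), (grid - 1 - m, m), (grid - 1 - m, grid - 1 - m)]

    X = np.zeros((n_samples, len(sensors), nt))
    y = np.zeros((n_samples, 2), dtype=int)
    for k in range(n_samples):
        src = tuple(int(v) for v in rng.integers(2, grid - 2, size=2))
        X[k] = simulate_wave(c, src, nt, sensors=sensors)
        y[k] = src
    return X, y


if __name__ == "__main__":
    X, y = make_dataset(4)
    print(X.shape, y.shape)
```

Each sample pairs the traces recorded at a few virtual seismometers with the epicenter that produced them, which is the input/target structure an epicenter-retrieval model would train on. Varying the number of sensors or time steps in such a generator is one way to explore the data-budgeting question the abstract mentions.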

Code Implementations: 1 repo
