An Unsupervised Machine Learning Approach for Ground-Motion Spectra Clustering and Selection
This work addresses ground-motion selection for structural engineering, offering an incremental improvement by integrating domain knowledge into a machine learning framework.
The paper tackles the problem of selecting representative earthquake ground-motion spectra for engineering design. It develops an unsupervised machine learning algorithm that uses an autoencoder to discover latent features and combines them with traditional intensity measures for clustering. The method is validated on three examples, using synthetic and field-recorded datasets, and is reported to perform well when compared to a benchmark seismic hazard analysis.
Clustering analysis of sequence data continues to address many applications in engineering design, aided by the rapid growth of machine learning in applied science. This paper presents an unsupervised machine learning algorithm to extract defining characteristics of earthquake ground-motion spectra, also called latent features, to aid in ground-motion selection (GMS). In this context, a latent feature is a low-dimensional, machine-discovered spectral characteristic learned through the nonlinear relationships of a neural network autoencoder. Machine-discovered latent features can be combined with traditionally defined intensity measures, and clustering can then be performed to select a representative subgroup from a large ground-motion suite. The objective of efficient GMS is to choose characteristic records representative of what the structure will probabilistically experience in its lifetime. Three examples are presented to validate this approach, including the use of synthetic and field-recorded ground-motion datasets. The presented deep embedding clustering of ground-motion spectra has three main advantages: (1) defining characteristics that represent the sparse spectral content of ground motions are discovered efficiently through training of the autoencoder, (2) domain knowledge is incorporated into the machine learning framework with conditional variables in the deep embedding scheme, and (3) the method exhibits excellent performance when compared to a benchmark seismic hazard analysis.
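The workflow described in the abstract (learn latent features with an autoencoder, append intensity measures, cluster, pick one record per cluster) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the network size, the plain gradient-descent training, the use of k-means on a fixed embedding in place of deep embedding clustering, and the simple concatenation of intensity measures (rather than conditional variables inside the embedding) are all simplifying assumptions. The function names and the synthetic data generator are hypothetical.

```python
import numpy as np

def train_autoencoder(X, latent_dim=2, hidden=16, epochs=500, lr=0.01, seed=0):
    """Fit a small tanh autoencoder by full-batch gradient descent; return the encoder."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    sizes = [(d, hidden), (hidden, latent_dim), (latent_dim, hidden), (hidden, d)]
    W = [rng.normal(0.0, 0.1, s) for s in sizes]
    b = [np.zeros(s[1]) for s in sizes]
    for _ in range(epochs):
        h1 = np.tanh(X @ W[0] + b[0])          # encoder hidden layer
        z = h1 @ W[1] + b[1]                   # latent features
        h2 = np.tanh(z @ W[2] + b[2])          # decoder hidden layer
        x_hat = h2 @ W[3] + b[3]               # reconstruction
        g_out = 2.0 * (x_hat - X) / n          # gradient of the squared reconstruction error
        g_h2 = (g_out @ W[3].T) * (1.0 - h2 ** 2)
        g_z = g_h2 @ W[2].T
        g_h1 = (g_z @ W[1].T) * (1.0 - h1 ** 2)
        grads_W = [X.T @ g_h1, h1.T @ g_z, z.T @ g_h2, h2.T @ g_out]
        grads_b = [g_h1.sum(0), g_z.sum(0), g_h2.sum(0), g_out.sum(0)]
        for i in range(4):
            W[i] -= lr * grads_W[i]
            b[i] -= lr * grads_b[i]
    return lambda A: np.tanh(A @ W[0] + b[0]) @ W[1] + b[1]

def kmeans(F, k, iters=50, seed=0):
    """Plain k-means (Lloyd's algorithm); stands in for the paper's clustering step."""
    rng = np.random.default_rng(seed)
    centers = F[rng.choice(len(F), k, replace=False)]
    for _ in range(iters):
        labels = ((F[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = F[labels == j].mean(0)
    labels = ((F[:, None, :] - centers[None]) ** 2).sum(-1).argmin(1)
    return labels, centers

def cluster_spectra(spectra, intensity_measures, k, latent_dim=2):
    """Return cluster labels and, per non-empty cluster, the record nearest its centroid."""
    X = np.log(spectra)                         # assumes strictly positive spectral ordinates
    X = (X - X.mean(0)) / (X.std(0) + 1e-8)
    Z = train_autoencoder(X, latent_dim)(X)
    F = np.hstack([Z, intensity_measures])      # assumption: simple concatenation
    F = (F - F.mean(0)) / (F.std(0) + 1e-8)
    labels, centers = kmeans(F, k)
    reps = []
    for j, c in enumerate(centers):
        idx = np.flatnonzero(labels == j)
        if idx.size:
            reps.append(int(idx[((F[idx] - c) ** 2).sum(1).argmin()]))
    return labels, reps

def make_synthetic_spectra(n=60, n_periods=40, seed=1):
    """Hypothetical toy spectra: decaying curves with random amplitude and decay rate."""
    rng = np.random.default_rng(seed)
    periods = np.linspace(0.05, 5.0, n_periods)
    amp = rng.uniform(0.1, 1.0, (n, 1))
    tau = rng.uniform(0.3, 2.0, (n, 1))
    spectra = amp * np.exp(-periods / tau) + 1e-3
    peak = spectra.max(1, keepdims=True)        # toy intensity measure: peak ordinate
    return spectra, peak

spectra, ims = make_synthetic_spectra()
labels, representatives = cluster_spectra(spectra, ims, k=3)
print("selected record indices:", representatives)
```

Here `representatives` holds the index of the record closest to each cluster centroid, which is one simple way to pick a representative subgroup from a larger suite. A practical study would replace the toy generator with real response spectra and intensity measures chosen for the site and structure.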