LGAug 26, 2025

Breaking the Black Box: Inherently Interpretable Physics-Constrained Machine Learning With Weighted Mixed-Effects for Imbalanced Seismic Data

arXiv:2508.19031v2h-index: 5
Originality Incremental advance
AI Analysis

This addresses the need for transparent and reliable seismic risk assessment for engineers and policymakers, though it is incremental in combining interpretability and imbalance handling for a specific domain.

The study tackled the problem of black-box machine learning models and data imbalance in seismic ground motion modeling by developing an interpretable neural network with physics-constrained weighting and regularization, achieving a coefficient of determination of 88.48% and unbiased residuals with physically consistent variance components.

Ground motion models (GMMs) are critical for seismic risk mitigation and infrastructure design. Machine learning (ML) is increasingly applied to GMM development due to expanding strong motion databases. However, existing ML-based GMMs operate as 'black boxes,' creating opacity that undermines confidence in engineering decisions. Moreover, seismic datasets exhibit severe imbalance, with scarce large-magnitude near-field records causing systematic underprediction of critical high-hazard ground motions. Despite these limitations, research addressing both interpretability and data imbalance remains limited. This study develops an inherently interpretable neural network employing independent additive pathways with novel HazBinLoss and concurvity regularization. HazBinLoss integrates physics-constrained weighting with inverse bin count scaling to address underfitting in sparse, high-hazard regions. Concurvity regularization enforces pathway orthogonality, reducing inter-pathway correlation. The model achieves robust performance: mean squared error = 0.6235, mean absolute error = 0.6230, and coefficient of determination = 88.48%. Pathway scaling corroborates established seismological behaviors. Weighted hierarchical Student-t mixed-effects analysis demonstrates unbiased residuals with physically consistent variance partitioning: sigma components range from 0.26-0.38 (inter-event), 0.12-0.41 (inter-region), 0.58-0.71 (intra-event), and 0.68-0.89 (total). The lower inter-event and higher intra-event components have implications for non-ergodic hazard analysis. Predictions exhibit strong agreement with NGA-West2 GMMs across diverse conditions. This interpretable framework advances GMMs, establishing a transparent, physics-consistent foundation for seismic hazard and risk assessment.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes