Evaluation of Seismic Artificial Intelligence with Uncertainty
This work addresses a domain-specific problem for seismic practitioners by providing an incremental evaluation framework to improve model comparison and selection.
The paper addresses the lack of robust evaluation frameworks for deep learning models in seismology by designing one that incorporates performance uncertainty and learning efficiency. It demonstrates the framework by evaluating PhaseNet under different training approaches, helping practitioners choose models and set performance expectations.
Artificial intelligence has transformed the seismic community with deep learning models (DLMs) that are trained to complete specific tasks within workflows. However, robust frameworks for evaluating and comparing DLMs are still lacking. We address this gap by designing an evaluation framework that jointly incorporates two crucial aspects: performance uncertainty and learning efficiency. To target these aspects, we carefully construct the training, validation, and test splits using a clustering method tailored to seismic data, and we use an extensive training design to isolate the performance uncertainty arising from stochastic training processes and random data sampling. We demonstrate the framework's ability to guard against misleading declarations of model superiority by evaluating PhaseNet [1], a popular seismic phase picking DLM, under three training approaches. Our framework helps practitioners choose the best model for their problem and set performance expectations by explicitly analyzing model performance with uncertainty at varying budgets of training data.
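To make the idea of attributing performance uncertainty to its two sources more concrete, the following is a minimal sketch and not the authors' implementation. It assumes a nested design in which, at each training-data budget, several random training subsets are drawn and each subset is trained with several random seeds. The function name `decompose_uncertainty`, the budgets, and the scores are all hypothetical and chosen only for illustration. The split uses the law of total variance with population variances, so it is a descriptive decomposition rather than an unbiased estimator.

```python
from statistics import fmean, pvariance


def decompose_uncertainty(scores):
    """Split score variance into data-sampling and training-stochasticity parts.

    scores[i][j] is a test metric for data subset i trained with seed j.
    Assumes every subset has the same number of seeds (hypothetical design).
    """
    subset_means = [fmean(row) for row in scores]
    # Spread of per-subset means: attributed here to random data sampling.
    data_var = pvariance(subset_means)
    # Average spread across seeds within a subset: attributed to stochastic training.
    train_var = fmean([pvariance(row) for row in scores])
    return {
        "mean": fmean(subset_means),
        "data_var": data_var,
        "train_var": train_var,
        # With equal seeds per subset, this equals the variance of all scores pooled.
        "total_var": data_var + train_var,
    }


# Made-up scores for illustration only: {budget: [[seed scores] per data subset]}.
scores_by_budget = {
    100: [[0.70, 0.74], [0.80, 0.84]],
    1000: [[0.90, 0.92], [0.91, 0.93]],
}

# One summary per training-data budget, giving a learning curve with uncertainty.
summary = {
    budget: decompose_uncertainty(scores)
    for budget, scores in scores_by_budget.items()
}

for budget, stats in sorted(summary.items()):
    print(budget, stats)
```

Comparing two models on such summaries, rather than on a single training run, is one way to avoid declaring a model superior when the gap between them is smaller than the spread caused by seeds or data sampling.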