Samuel Myren

h-index2

3papers

8citations

Novelty30%

AI Score29

Ranked #143,629 of 194,257 authors (top 74%)#31,631 in LG (top 79%)

3 Papers

11.4LGJan 15, 2025

Evaluation of Seismic Artificial Intelligence with Uncertainty

Samuel Myren, Nidhi Parikh, Rosalyn Rael et al.

Artificial intelligence has transformed the seismic community with deep learning models (DLMs) that are trained to complete specific tasks within workflows. However, there is still lack of robust evaluation frameworks for evaluating and comparing DLMs. We address this gap by designing an evaluation framework that jointly incorporates two crucial aspects: performance uncertainty and learning efficiency. To target these aspects, we meticulously construct the training, validation, and test splits using a clustering method tailored to seismic data and enact an expansive training design to segregate performance uncertainty arising from stochastic training processes and random data sampling. The framework's ability to guard against misleading declarations of model superiority is demonstrated through evaluation of PhaseNet [1], a popular seismic phase picking DLM, under 3 training approaches. Our framework helps practitioners choose the best model for their problem and set performance expectations by explicitly analyzing model performance with uncertainty at varying budgets of training data.

9.5HCJan 28, 2025

The Trust Calibration Maturity Model for Characterizing and Communicating Trustworthiness of AI Systems

Scott T Steinmetz, Asmeret Naugle, Paul Schutte et al.

Recent proliferation of powerful AI systems has created a strong need for capabilities that help users to calibrate trust in those systems. As AI systems grow in scale, information required to evaluate their trustworthiness becomes less accessible, presenting a growing risk of using these systems inappropriately. We propose the Trust Calibration Maturity Model (TCMM) to characterize and communicate information about AI system trustworthiness. The TCMM incorporates five dimensions of analytic maturity: Performance Characterization, Bias & Robustness Quantification, Transparency, Safety & Security, and Usability. The TCMM can be presented along with system performance information to (1) help a user to appropriately calibrate trust, (2) establish requirements and track progress, and (3) identify research needs. Here, we discuss the TCMM and demonstrate it on two target tasks: using ChatGPT for high consequence nuclear science determinations, and using PhaseNet (an ensemble of seismic models) for categorizing sources of seismic events.

1.4LGJan 13

Meta-learning to Address Data Shift in Time Series Classification

Samuel Myren, Nidhi Parikh, Natalie Klein

Across engineering and scientific domains, traditional deep learning (TDL) models perform well when training and test data share the same distribution. However, the dynamic nature of real-world data, broadly termed \textit{data shift}, renders TDL models prone to rapid performance degradation, requiring costly relabeling and inefficient retraining. Meta-learning, which enables models to adapt quickly to new data with few examples, offers a promising alternative for mitigating these challenges. Here, we systematically compare TDL with fine-tuning and optimization-based meta-learning algorithms to assess their ability to address data shift in time-series classification. We introduce a controlled, task-oriented seismic benchmark (SeisTask) and show that meta-learning typically achieves faster and more stable adaptation with reduced overfitting in data-scarce regimes and smaller model architectures. As data availability and model capacity increase, its advantages diminish, with TDL with fine-tuning performing comparably. Finally, we examine how task diversity influences meta-learning and find that alignment between training and test distributions, rather than diversity alone, drives performance gains. Overall, this work provides a systematic evaluation of when and why meta-learning outperforms TDL under data shift and contributes SeisTask as a benchmark for advancing adaptive learning research in time-series domains.