Evaluating PhaseNet on Teleseismic Data with MsPASS
This work provides a reproducible workflow and benchmark for applying PhaseNet to teleseismic data, addressing the picker's known weakness on teleseismic signals.
PhaseNet, a machine-learning picker, performs poorly on teleseismic data. Training it from scratch on the training split of a 1.6-million-waveform teleseismic dataset increased P-pick recall by 741.5% and yielded 683.9% more picks within a 0.1 s residual window. Enlarging the model about 120-fold improved precision and recall by only 15.6% and 23.2%, respectively, while reducing inference throughput by more than 87%.
Numerous studies have shown that the machine-learning picker PhaseNet produces accurate P and S picks on local earthquake signals, but its performance can degrade sharply on teleseismic signals. To address this limitation, we present a reproducible MsPASS workflow that (i) enables scalable data preparation and management for large seismic archives and (ii) supports standardized PhaseNet training and inference. We assembled a control dataset of 1.6 million waveforms linked to teleseismic P-wave picks made by analysts at the USArray Array Network Facility (ANF). The control dataset confirms that the PhaseNet model trained on regional signals performs poorly on these data. We then trained PhaseNet from scratch on the training split of the ANF control dataset and evaluated it on a non-overlapping held-out test split, which increased P-pick recall by 741.5% and yielded 683.9% more picks within a 0.1 s residual window. We also evaluated PhaseNet across different model sizes on both CPUs and GPUs. Increasing the model size by about 120 times improved precision and recall by 15.6% and 23.2%, respectively. However, the scaled model reduced inference throughput by 87.2% on an NVIDIA A100 GPU and by 97.3% on a 128-core high-performance CPU node. These results indicate that scaling PhaseNet is more practical on GPUs than on CPUs, and that simply enlarging the model is not an efficient way to achieve large accuracy gains.
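For readers who want to reproduce this kind of comparison, the metrics quoted above can be sketched as follows. This is a minimal illustration and not the evaluation code used in the study: the function names, the greedy one-to-one matching rule, and the 0.5 s matching tolerance are all assumptions introduced here, and the pick times are made-up values. Only the 0.1 s residual window comes from the text.

```python
from bisect import bisect_left


def match_picks(predicted, reference, tolerance):
    """Greedily match each reference pick to the closest unused predicted
    pick within +/- tolerance seconds; return residuals (predicted - reference).
    The matching rule is an assumption made for illustration."""
    predicted = sorted(predicted)
    used = [False] * len(predicted)
    residuals = []
    for ref in sorted(reference):
        i = bisect_left(predicted, ref - tolerance)
        best = None
        while i < len(predicted) and predicted[i] <= ref + tolerance:
            if not used[i] and (
                best is None
                or abs(predicted[i] - ref) < abs(predicted[best] - ref)
            ):
                best = i
            i += 1
        if best is not None:
            used[best] = True
            residuals.append(predicted[best] - ref)
    return residuals


def precision_recall(predicted, reference, tolerance):
    """Precision = matched / predicted, recall = matched / reference."""
    matched = len(match_picks(predicted, reference, tolerance))
    precision = matched / len(predicted) if predicted else 0.0
    recall = matched / len(reference) if reference else 0.0
    return precision, recall


def percent_change(old, new):
    """Relative change of new with respect to old, in percent."""
    return 100.0 * (new - old) / old


# Made-up pick times in seconds, for illustration only.
analyst_picks = [10.0, 50.0, 90.0, 130.0]
model_picks = [10.04, 50.3, 200.0]

residuals = match_picks(model_picks, analyst_picks, tolerance=0.5)
precision, recall = precision_recall(model_picks, analyst_picks, tolerance=0.5)
within_window = sum(1 for r in residuals if abs(r) <= 0.1)
```

Read this way, a 741.5% increase in recall means the new value is 8.415 times the old one, and throughput reductions of 87.2% and 97.3% leave 12.8% and 2.7% of the original throughput, that is, inference roughly 7.8 and 37 times slower.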