Paul Johnson

4.3 | GEO-PH | Dec 12, 2019

Attention network forecasts time-to-failure in laboratory shear experiments

Hope Jasperson, David C. Bolton, Paul Johnson et al.

Rocks under stress deform by creep mechanisms that include formation and slip on small-scale internal cracks. Intragranular cracks and slip along grain contacts release energy as elastic waves termed acoustic emissions (AE). AEs are thought to contain predictive information that can be used for fault failure forecasting. Here we present a method using unsupervised classification and an attention network to forecast labquakes using AE waveform features. Our data were generated in a laboratory setting using a biaxial shearing device with granular fault gouge intended to mimic the conditions of tectonic faults. We analyzed the temporal evolution of AEs generated throughout several hundred laboratory earthquake cycles. We used a Conscience Self-Organizing Map (CSOM) to perform topologically ordered vector quantization based on waveform properties. The resulting map was used to interactively cluster AEs. We examined the clusters over time to identify those with predictive ability. Finally, we used a variety of LSTM and attention-based networks to test the predictive power of the AE clusters. By tracking cumulative waveform features over the seismic cycle, the network is able to forecast the time-to-failure (TTF) of lab earthquakes. Our results show that analyzing the data to isolate predictive signals and using a more sophisticated network architecture are key to robustly forecasting labquakes. In the future, this method could be applied to tectonic faults to monitor earthquakes and augment current early warning systems.
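
As a rough illustration of the forecasting target mentioned in this abstract, the sketch below computes a time-to-failure label for each sample time as the gap to the next failure event. The function name `time_to_failure` and the input format (plain lists of times in arbitrary units) are assumptions made for this example, not details taken from the paper.

```python
from bisect import bisect_left


def time_to_failure(sample_times, failure_times):
    """Return, for each sample time, the time remaining until the next failure.

    Hypothetical helper: samples that fall after the last recorded failure
    have no defined target and are returned as None.
    """
    failures = sorted(failure_times)
    ttf = []
    for t in sample_times:
        # Index of the first failure occurring at or after time t.
        i = bisect_left(failures, t)
        ttf.append(failures[i] - t if i < len(failures) else None)
    return ttf


if __name__ == "__main__":
    # Two failures, at t=2 and t=5; the label counts down to each one.
    print(time_to_failure([0, 1, 2, 3, 4, 5, 6], [2, 5]))
```

A label of this kind resets after every failure, which is why features accumulated over each cycle are a natural input for a model that regresses it.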

8.1 | ML | Apr 8, 2014

Data mining for censored time-to-event data: A Bayesian network model for predicting cardiovascular risk from electronic health record data

Sunayan Bandyopadhyay, Julian Wolfson, David M. Vock et al.

Models for predicting the risk of cardiovascular events based on individual patient characteristics are important tools for managing patient care. Most current and commonly used risk prediction models have been built from carefully selected epidemiological cohorts. However, the homogeneity and limited size of such cohorts restrict the predictive power and generalizability of these risk models to other populations. Electronic health data (EHD) from large health care systems provide access to data on large, heterogeneous, and contemporaneous patient populations. The unique features and challenges of EHD, including missing risk factor information, non-linear relationships between risk factors and cardiovascular event outcomes, and differing effects from different patient subgroups, demand novel machine learning approaches to risk model development. In this paper, we present a machine learning approach based on Bayesian networks trained on EHD to predict the probability of having a cardiovascular event within five years. In such data, event status may be unknown for some individuals as the event time is right-censored due to disenrollment and incomplete follow-up. Since many traditional data mining methods are not well-suited for such data, we describe how to modify both modelling and assessment techniques to account for censored observation times. We show that our approach can lead to better predictive performance than the Cox proportional hazards model (i.e., a regression-based approach commonly used for censored, time-to-event data) or a Bayesian network with ad hoc approaches to right-censoring. Our techniques are motivated by and illustrated on data from a large U.S. Midwestern health care system.
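
To make the right-censoring issue in this abstract concrete, here is a minimal sketch of the Kaplan-Meier survival estimator, a standard way to use censored follow-up times without treating them as event-free. It is not the Bayesian network method the abstract describes; the function name `kaplan_meier` and the toy data are assumptions for illustration only.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival curve for right-censored data.

    times  : follow-up time for each individual
    events : 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each observed event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        leaving = 0   # everyone leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            leaving += 1
            i += 1
        if observed > 0:
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


if __name__ == "__main__":
    # Individuals at t=2 and t=4 are censored (e.g. disenrolled), not event-free.
    for t, s in kaplan_meier([1, 2, 3, 4, 5], [1, 0, 1, 0, 1]):
        print(t, round(s, 4))
```

Censored individuals count toward the risk set up to their last follow-up and then drop out, so they neither count as events nor as confirmed survivors beyond that time.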

2 Papers