BERTSurv: BERT-Based Survival Models for Predicting Outcomes of Trauma Patients
This work addresses the shortage of deep learning methods for survival analysis in healthcare with a model that improves prediction accuracy for trauma patient outcomes. The contribution is incremental, since it adapts existing BERT techniques to a specific domain.
The paper predicts mortality and survival outcomes for ICU trauma patients with BERTSurv, a BERT-based deep learning framework that integrates clinical notes and measurements. It reports an AUC-ROC of 0.86 for mortality prediction (a 3.6% improvement over an MLP baseline without notes) and a C-index of 0.7 for survival analysis.
Survival analysis is a technique for predicting the time until a specific outcome occurs, and it is widely used to predict outcomes for intensive care unit (ICU) trauma patients. Recently, deep learning models have drawn increasing attention in healthcare. However, there is a lack of deep learning methods that can model the relationship between measurements, clinical notes and mortality outcomes. In this paper we introduce BERTSurv, a deep learning survival framework that applies Bidirectional Encoder Representations from Transformers (BERT) as a language representation model on unstructured clinical notes for mortality prediction and survival analysis. We also incorporate clinical measurements in BERTSurv. With binary cross-entropy (BCE) loss, BERTSurv predicts mortality as a binary outcome (mortality prediction). With partial log-likelihood (PLL) loss, BERTSurv predicts the probability of mortality as a time-to-event outcome (survival analysis). We apply BERTSurv to trauma patient data from Medical Information Mart for Intensive Care III (MIMIC III). For mortality prediction, BERTSurv obtained an area under the receiver operating characteristic curve (AUC-ROC) of 0.86, an improvement of 3.6% over a multilayer perceptron (MLP) baseline without notes. For survival analysis, BERTSurv achieved a concordance index (C-index) of 0.7. In addition, visualizations of BERT's attention heads help to extract patterns in clinical notes and improve model interpretability by showing how the model assigns weights to different inputs.
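To make the survival-analysis terms concrete, the following is a minimal sketch of a negative Cox-style partial log-likelihood and of Harrell's concordance index. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not give the exact form of the PLL loss, so the Breslow-style risk set, the averaging over observed events, and the function names `neg_partial_log_likelihood` and `concordance_index` are all assumptions made here. The inputs are one risk score per patient (for example, the scalar output of a network), the observed time, and an event flag.

```python
import math


def neg_partial_log_likelihood(risk_scores, times, events):
    """Negative Cox partial log-likelihood, averaged over observed events.

    Assumed form (not taken from the paper): Breslow-style risk sets with
    no special handling of tied event times.
    risk_scores: one predicted log-risk per patient
    times: observed follow-up time per patient
    events: 1 if death was observed, 0 if the patient was censored
    """
    total = 0.0
    n_events = 0
    for i, (t_i, e_i) in enumerate(zip(times, events)):
        if not e_i:
            continue  # censored patients only contribute through risk sets
        # Risk set: everyone still under observation at time t_i.
        risk_set = [risk_scores[j] for j, t_j in enumerate(times) if t_j >= t_i]
        # Log-sum-exp with max subtraction for numerical stability.
        m = max(risk_set)
        log_sum = m + math.log(sum(math.exp(r - m) for r in risk_set))
        total += risk_scores[i] - log_sum
        n_events += 1
    return -total / max(n_events, 1)


def concordance_index(risk_scores, times, events):
    """Harrell's C-index: fraction of comparable pairs ranked correctly.

    A pair (i, j) is comparable when patient i had an observed event and
    patient j was followed for longer. Ties in risk score count as 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    return concordant / comparable if comparable else float("nan")
```

Under this definition, a C-index of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of patients by risk, which gives context for the reported value of 0.7. In practice the loss would be written with a differentiable tensor library so that gradients flow back into BERT; the pure-Python version above is only meant to show the arithmetic.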