Mortality Prediction Models with Clinical Notes Using Sparse Attention at the Word and Sentence Levels
This work addresses the need for transparent, well-performing neural models for intensive care mortality prediction, although its contribution is incremental because it builds on existing attention-based methods.
The study addresses in-hospital mortality prediction from clinical notes by comparing sparse and dense attention mechanisms. Sparse attention at the word level improved predictive performance and concentrated attention on relevant directive words, but performance degraded at the sentence level because sentences containing influential directive words tended to be dropped.
In-hospital mortality prediction in intensive care has various clinical applications. Neural prediction models, especially those capitalising on clinical notes, have been put forward as an improvement on existing models. However, to be acceptable, these models should be both performant and transparent. This work studies different attention mechanisms for clinical neural prediction models in terms of their discrimination and calibration. Specifically, we investigate sparse attention as an alternative to dense attention weights in the task of in-hospital mortality prediction from clinical notes. We evaluate the attention mechanisms based on: i) local self-attention over words in a sentence, and ii) global self-attention with a transformer architecture across sentences. We demonstrate that, on a publicly available dataset, the sparse mechanism outperforms the dense one for local self-attention in terms of predictive performance, and places more attention on prespecified relevant directive words. The performance at the sentence level, however, deteriorates, as sentences including the influential directive words tend to be dropped altogether.
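The contrast between dense and sparse attention weights can be illustrated with a minimal sketch. The abstract does not name the sparse transformation used, so sparsemax is assumed here as one common choice; the attention scores and the tokens they stand for are hypothetical values chosen only for illustration.

```python
import numpy as np

def softmax(z):
    """Dense attention: every item receives a strictly positive weight."""
    z = np.asarray(z, dtype=float)
    e = np.exp(z - z.max())  # subtract the max for numerical stability
    return e / e.sum()

def sparsemax(z):
    """Sparse attention (assumed here): Euclidean projection of the scores
    onto the probability simplex, which can set weights to exactly zero."""
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]              # scores in descending order
    cumsum = np.cumsum(z_sorted)
    k = np.arange(1, z.size + 1)
    support = 1 + k * z_sorted > cumsum      # items that stay in the support
    k_max = k[support][-1]                   # size of the support
    tau = (cumsum[k_max - 1] - 1) / k_max    # threshold subtracted from scores
    return np.maximum(z - tau, 0.0)

# Hypothetical attention scores for four tokens in one sentence.
scores = [1.5, 1.0, 0.2, -1.0]
dense = softmax(scores)
sparse = sparsemax(scores)
print("dense :", np.round(dense, 3))
print("sparse:", np.round(sparse, 3))
```

With these scores, the dense weights are all non-zero, whereas the sparse weights drop the two lowest-scoring items entirely. The same behaviour applied across sentences is consistent with the abstract's observation that whole sentences, including ones containing directive words, can be dropped.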