LG | Feb 9, 2024

TEE4EHR: Transformer Event Encoder for Better Representation Learning in Electronic Health Records

arXiv:2402.06367v1 | 10 citations | h-index: 3 | Artif. Intell. Medicine
AI Analysis

This work addresses challenges in representing EHR data for clinical prediction tasks. It is incremental, building on existing transformer and point process methods.

The paper tackles irregular sampling and non-random missing data in electronic health records with TEE4EHR, a transformer event encoder trained with a point process loss. On real-world EHR databases, it outperforms state-of-the-art models in future event prediction and outcome prediction.

Irregular sampling of time series in electronic health records (EHRs) is one of the main challenges for developing machine learning models. Additionally, the pattern of missing data in certain clinical variables is not at random but depends on the decisions of clinicians and the state of the patient. The point process is a mathematical framework for analyzing event sequence data that is consistent with irregular sampling patterns. Our model, TEE4EHR, is a transformer event encoder (TEE) with a point process loss that encodes the pattern of laboratory tests in EHRs. We investigate the utility of our TEE on a variety of benchmark event sequence datasets. Additionally, we conduct experiments on two real-world EHR databases to provide a more comprehensive evaluation of our model. First, in a self-supervised learning approach, the TEE is jointly learned with an existing attention-based deep neural network, which gives superior performance in negative log-likelihood and future event prediction. In addition, we propose an algorithm for aggregating attention weights that can reveal the interaction between the events. Second, we transfer and freeze the learned TEE for the downstream outcome prediction task, where it outperforms state-of-the-art models for handling irregularly sampled time series. Furthermore, our results demonstrate that our approach can improve representation learning in EHRs and can be useful for clinical prediction tasks.
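
The abstract does not spell out the exact form of the point process loss, so the following is only a minimal sketch of the standard temporal point process negative log-likelihood, NLL = ∫ λ(t) dt − Σ log λ(t_i), which is the usual quantity behind such a loss. The function name `point_process_nll`, its parameters, and the trapezoidal integration are illustrative assumptions, not the authors' implementation. In a model like the one described, the intensity would presumably be produced by the encoder rather than supplied as a fixed function.

```python
import math
from typing import Callable, Sequence


def point_process_nll(
    event_times: Sequence[float],
    intensity: Callable[[float], float],
    t_end: float,
    n_grid: int = 1000,
) -> float:
    """Negative log-likelihood of events on [0, t_end] under an intensity.

    Hypothetical helper for illustration only: `intensity` maps a time to a
    strictly positive rate (it may close over the event history).
    """
    # Sum of log-intensities at the observed event times.
    log_term = sum(math.log(intensity(t)) for t in event_times)

    # Compensator: integral of the intensity over [0, t_end],
    # approximated with the trapezoidal rule on a uniform grid.
    step = t_end / n_grid
    values = [intensity(k * step) for k in range(n_grid + 1)]
    integral = step * (sum(values) - 0.5 * (values[0] + values[-1]))

    return integral - log_term


# Usage: three irregularly spaced events under a constant rate of 2 per unit time.
nll = point_process_nll([0.5, 1.0, 1.5], lambda t: 2.0, t_end=2.0)
```

For a constant rate the result can be checked by hand, since the integral is the rate times the window length and each event contributes the log of the rate. Because the likelihood is defined directly on event times, no resampling to a regular grid is needed, which is why this framework suits irregularly sampled records.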

Code Implementations: 2 repos