LG · QM · Apr 15, 2022

Unsupervised Probabilistic Models for Sequential Electronic Health Records

arXiv:2204.07292v2 · 5 citations · h-index: 23
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of analyzing complex EHR data for healthcare applications, but it appears incremental as it builds on existing probabilistic modeling approaches without claiming major breakthroughs.

The authors model heterogeneous Electronic Health Record (EHR) data with an unsupervised probabilistic model that handles sequences of arbitrary length, such as medications and lab results. The model groups subjects into subgroups and captures the dynamics of each data type. The authors use the trained model to draw insights from the data and to analyze sequences that contribute to the assessment of mortality likelihood.

We develop an unsupervised probabilistic model for heterogeneous Electronic Health Record (EHR) data. Utilizing a mixture model formulation, our approach directly models sequences of arbitrary length, such as medications and laboratory results. This allows for subgrouping and incorporation of the dynamics underlying heterogeneous data types. The model consists of a layered set of latent variables that encode underlying structure in the data. These variables represent subject subgroups at the top layer, and unobserved states for sequences in the second layer. We train this model on episodic data from subjects receiving medical care in the Kaiser Permanente Northern California integrated healthcare delivery system. The resulting properties of the trained model generate novel insight from these complex and multifaceted data. In addition, we show how the model can be used to analyze sequences that contribute to assessment of mortality likelihood.
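The two-layer structure described above (subject subgroups on top, unobserved per-sequence states below) can be illustrated with a minimal sketch. The abstract does not specify the sequence model, so this sketch assumes a mixture of discrete hidden Markov models, with one HMM per data type in each subgroup and sequences treated as conditionally independent given the subgroup. The function names, the `"meds"` data type, and all parameter values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
import numpy as np

def hmm_loglik(seq, init, trans, emit):
    """Log-likelihood of a discrete sequence under an HMM (scaled forward pass).

    init:  (S,) initial state probabilities
    trans: (S, S) with trans[i, j] = P(next state j | current state i)
    emit:  (S, V) with emit[s, v] = P(symbol v | state s)
    """
    if len(seq) == 0:
        return 0.0  # an empty sequence contributes no evidence
    init, trans, emit = (np.asarray(x, dtype=float) for x in (init, trans, emit))
    alpha = init * emit[:, seq[0]]
    loglik = 0.0
    for t in range(len(seq)):
        if t > 0:
            alpha = (alpha @ trans) * emit[:, seq[t]]
        scale = alpha.sum()
        loglik += np.log(scale)
        alpha = alpha / scale  # rescale to avoid underflow on long sequences
    return loglik

def subgroup_posterior(subject, weights, hmms):
    """Posterior probability of each subgroup for one subject.

    subject: dict mapping data type -> list of integer symbols (any length)
    weights: (K,) mixture weights over subgroups
    hmms:    list of K dicts mapping data type -> (init, trans, emit)
    """
    log_joint = np.log(np.asarray(weights, dtype=float))
    for k, params in enumerate(hmms):
        for dtype, seq in subject.items():
            log_joint[k] += hmm_loglik(seq, *params[dtype])
    log_joint -= log_joint.max()  # stabilise before exponentiating
    post = np.exp(log_joint)
    return post / post.sum()

# Hypothetical toy setup: two subgroups, one data type, two states, two symbols.
init = [0.5, 0.5]
trans = [[0.9, 0.1], [0.1, 0.9]]
hmms = [
    {"meds": (init, trans, [[0.9, 0.1], [0.8, 0.2]])},  # subgroup 0 favours symbol 0
    {"meds": (init, trans, [[0.1, 0.9], [0.2, 0.8]])},  # subgroup 1 favours symbol 1
]
weights = [0.5, 0.5]
subject = {"meds": [0, 0, 0, 0]}

posterior = subgroup_posterior(subject, weights, hmms)
print(posterior)
```

Because the forward pass works on sequences of any length, subjects with short and long records can be compared under the same mixture. Fitting the parameters (for example with expectation-maximization) is left out of this sketch.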

