CL, LG | Nov 12, 2021

Unifying Heterogeneous Electronic Health Records Systems via Text-Based Code Embedding

arXiv:2111.09098v4 | 3 citations
Originality: Incremental advance
AI Analysis

This work addresses interoperability issues in EHR systems for healthcare providers and researchers, enabling broader deployment of deep learning models across clinics and hospitals. It is incremental in that it builds on existing neural language models.

The paper tackles the lack of a unified code system in electronic health records (EHR) by introducing DescEmb, a text-based embedding framework that represents clinical events by their textual descriptions. DescEmb outperformed traditional code-based methods, especially in zero-shot transfer tasks, and enabled a single model to be trained on heterogeneous datasets.

EHR systems lack a unified code system for representing medical concepts, which acts as a barrier for the deployment of deep learning models in large scale to multiple clinics and hospitals. To overcome this problem, we introduce Description-based Embedding, DescEmb, a code-agnostic representation learning framework for EHR. DescEmb takes advantage of the flexibility of neural language understanding models to embed clinical events using their textual descriptions rather than directly mapping each event to a dedicated embedding. DescEmb outperformed traditional code-based embedding in extensive experiments, especially in a zero-shot transfer task (one hospital to another), and was able to train a single unified model for heterogeneous EHR datasets.
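The contrast the abstract draws between code-based and description-based embedding can be illustrated with a minimal sketch. It is not the authors' implementation: the toy hashed bag-of-words encoder stands in for a neural language model, and the hospital codes (`A_LAB_001`, `B_GLU_9`), their descriptions, and all function names are hypothetical.

```python
import hashlib
import math
from typing import Dict, List

DIM = 16  # toy embedding size (assumption, not from the paper)


def _token_vector(token: str) -> List[float]:
    """Deterministic pseudo-random vector for a string (stand-in for learned weights)."""
    digest = hashlib.sha256(token.encode("utf-8")).digest()
    return [(b / 255.0) * 2.0 - 1.0 for b in digest[:DIM]]


def embed_description(text: str) -> List[float]:
    """Toy text encoder: sum of token vectors, L2-normalised.

    A real system would use a neural text encoder here instead.
    """
    tokens = text.lower().split()
    if not tokens:
        raise ValueError("description must contain at least one token")
    summed = [0.0] * DIM
    for tok in tokens:
        for i, v in enumerate(_token_vector(tok)):
            summed[i] += v
    norm = math.sqrt(sum(v * v for v in summed)) or 1.0
    return [v / norm for v in summed]


def cosine(a: List[float], b: List[float]) -> float:
    """Cosine similarity of two already-normalised vectors (plain dot product)."""
    return sum(x * y for x, y in zip(a, b))


# Hypothetical records: two hospitals use different local codes for the same event.
hospital_a: Dict[str, str] = {"A_LAB_001": "Glucose blood"}
hospital_b: Dict[str, str] = {"B_GLU_9": "blood glucose"}

# Code-based embedding: one dedicated vector per code seen at hospital A.
code_table_a = {code: _token_vector(code) for code in hospital_a}

# Description-based embedding: the vector depends only on the text.
vec_a = embed_description(hospital_a["A_LAB_001"])
vec_b = embed_description(hospital_b["B_GLU_9"])

similarity = cosine(vec_a, vec_b)
unseen_in_code_table = "B_GLU_9" not in code_table_a
```

In this sketch the lookup table built at hospital A has no entry for hospital B's code, while the two descriptions map to nearly identical vectors. That is the intuition behind the zero-shot transfer setting the abstract mentions; a bag-of-words encoder matches only on shared tokens, whereas a neural encoder is assumed to generalise to paraphrased descriptions as well.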

Code Implementations: 1 repo