CL · AI · Jan 27, 2023

A Multi-View Joint Learning Framework for Embedding Clinical Codes and Text Using Graph Neural Networks

arXiv:2301.11608v1 · 1 citation · h-index: 16
Originality: Incremental advance
AI Analysis

This work addresses the problem of efficient and effective clinical text representation for healthcare applications, offering a computationally lighter alternative to large language models, though the advance is incremental because it combines existing methods.

The paper tackles the challenge of representing clinical text for machine learning by proposing a multi-view framework that jointly learns from clinical codes and text, using a Graph Neural Network for the codes, a Bi-LSTM for the text, and Deep Canonical Correlation Analysis to align the two views. The results show that the model outperforms BERT models fine-tuned to clinical data on planned surgical procedure text and is competitive with a fine-tuned BERT on MIMIC-III text, while requiring significantly less computational effort.

Learning to represent free text is a core task in many clinical machine learning (ML) applications, as clinical text contains observations and plans not otherwise available for inference. State-of-the-art methods use large language models developed with immense computational resources and training data; however, applying these models is challenging because of the highly varying syntax and vocabulary in clinical free text. Structured information such as International Classification of Diseases (ICD) codes often succinctly abstracts the most important facts of a clinical encounter and yields good performance, but is often not as available as clinical text in real-world scenarios. We propose a multi-view learning framework that jointly learns from codes and text to combine the availability and forward-looking nature of text with the better performance of ICD codes. The learned text embeddings can be used as inputs to predictive algorithms independent of the ICD codes during inference. Our approach uses a Graph Neural Network (GNN) to process ICD codes and a Bi-LSTM to process text. We apply Deep Canonical Correlation Analysis (DCCA) to make the two views learn a similar representation of each patient. In experiments using planned surgical procedure text, our model outperforms BERT models fine-tuned to clinical data, and in experiments using diverse text in MIMIC-III, our model is competitive with a fine-tuned BERT at a tiny fraction of its computational effort.
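The correlation objective that DCCA optimizes can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names `total_canonical_correlation` and `dcca_loss`, the regularization constant `reg`, and the use of plain arrays in place of Bi-LSTM and GNN outputs are all assumptions made for illustration. The sketch only shows the standard quantity such a loss is typically built on, the sum of canonical correlations between two views of the same patients.

```python
import numpy as np


def total_canonical_correlation(h_text, h_code, reg=1e-4):
    """Sum of canonical correlations between two views.

    h_text, h_code: arrays of shape (n_patients, dim), standing in for the
    text-encoder and code-encoder outputs (hypothetical inputs).
    reg: small ridge term added to the covariances (an assumed value).
    """
    n = h_text.shape[0]
    # Center each view so covariances are computed around the mean.
    x = h_text - h_text.mean(axis=0)
    y = h_code - h_code.mean(axis=0)

    # Regularized within-view covariances and the cross-view covariance.
    sxx = x.T @ x / (n - 1) + reg * np.eye(x.shape[1])
    syy = y.T @ y / (n - 1) + reg * np.eye(y.shape[1])
    sxy = x.T @ y / (n - 1)

    def inv_sqrt(s):
        # Inverse matrix square root via the symmetric eigendecomposition.
        w, v = np.linalg.eigh(s)
        return v @ np.diag(w ** -0.5) @ v.T

    # Singular values of T are the canonical correlations.
    t = inv_sqrt(sxx) @ sxy @ inv_sqrt(syy)
    return np.linalg.svd(t, compute_uv=False).sum()


def dcca_loss(h_text, h_code):
    """Negative total correlation, so that minimizing it aligns the views."""
    return -total_canonical_correlation(h_text, h_code)
```

In a full model, this loss would be backpropagated through both encoders in an automatic-differentiation framework; at inference time only the text encoder would be needed, consistent with the abstract's statement that the text embeddings are used independently of the ICD codes.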

