Graph-Augmented Cyclic Learning Framework for Similarity Estimation of Medical Clinical Notes
This work addresses the problem of accurately estimating similarity in clinical texts for medical professionals, representing an incremental improvement over existing methods.
The paper tackles the challenge of semantic textual similarity in clinical notes by introducing a graph-augmented cyclic learning framework that leverages domain knowledge through co-training with a GCN, improving the Bio-clinical BERT baseline by 16.3% and 27.9%.
Semantic textual similarity (STS) in the clinical domain helps improve diagnostic efficiency and produce concise texts for downstream data mining tasks. However, given the high degree of domain knowledge involved in clinic text, it remains challenging for general language models to infer implicit medical relationships behind clinical sentences and output similarities correctly. In this paper, we present a graph-augmented cyclic learning framework for similarity estimation in the clinical domain. The framework can be conveniently implemented on a state-of-art backbone language model, and improve its performance by leveraging domain knowledge through co-training with an auxiliary graph convolution network (GCN) based network. We report the success of introducing domain knowledge in GCN and the co-training framework by improving the Bio-clinical BERT baseline by 16.3% and 27.9%, respectively.