CL · Feb 14, 2020

Understanding patient complaint characteristics using contextual clinical BERT embeddings

Budhaditya Saha, Sanal Lisboa, Shameek Ghosh

arXiv:2002.05902v1 · 0.59 citations

Originality: Incremental advance

AI Analysis

The paper addresses a gap in clinical conversational applications, namely understanding how patients characterize their symptoms. The advance is incremental, since it builds on existing methods such as BERT and LDA.

The paper tackles the problem of recognizing characterizations such as time, onset, and severity in patient complaints, which existing models often miss, and reports a 40-50% improvement in accuracy over state-of-the-art models.

In clinical conversational applications, extracted entities tend to capture the main subject of a patient's complaint, namely symptoms or diseases. However, they mostly fail to recognize the characterizations of a complaint, such as the time, the onset, and the severity. For example, if the input is "I have a headache and it is extreme", state-of-the-art models recognize only the main symptom entity, headache, and ignore the severity factor "extreme", which characterizes the headache. In this paper, we design a two-stage approach to detect the characterizations of entities such as symptoms, as presented by general users in contexts where they would describe their symptoms to a clinician. We use Word2Vec and BERT to encode clinical text given by the patients. We transform the output and re-frame the task as a multi-label classification problem. Finally, we combine the processed encodings with the Linear Discriminant Analysis (LDA) algorithm to classify the characterizations of the main entity. Experimental results demonstrate that our method achieves a 40-50% improvement in accuracy over the state-of-the-art models.
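The multi-label re-framing mentioned in the abstract can be illustrated with a minimal sketch. It shows only how a complaint's characterizations might be encoded as a binary indicator vector, which is the kind of target a classifier would then predict from the text encodings. The label set, the helper name `to_multilabel_target`, and the example sentences other than the headache one are assumptions made for illustration; they are not taken from the paper, and the encoding and LDA stages are not reproduced here.

```python
# Hypothetical label set: the abstract names time, onset, and severity
# as examples of characterizations; the paper's full label set may differ.
CHARACTERIZATIONS = ["time", "onset", "severity"]


def to_multilabel_target(labels):
    """Encode a set of characterization labels as a binary indicator vector."""
    unknown = set(labels) - set(CHARACTERIZATIONS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    # One slot per characterization; several slots may be 1 at once,
    # which is what makes the task multi-label rather than multi-class.
    return [1 if name in labels else 0 for name in CHARACTERIZATIONS]


# Illustrative, hand-annotated examples (not data from the paper).
examples = [
    ("I have a headache and it is extreme", {"severity"}),
    ("My cough started suddenly last night", {"time", "onset"}),
    ("I have a sore throat", set()),
]

targets = [to_multilabel_target(labels) for _, labels in examples]

for (text, _), target in zip(examples, targets):
    print(target, text)
```

Each position in a target vector corresponds to one characterization, so a complaint can carry none, one, or several of them at the same time.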
