CL LG ML | Sep 16, 2018

Development of deep learning algorithms to categorize free-text notes pertaining to diabetes: convolution neural networks achieve higher accuracy than support vector machines

arXiv:1809.05814v1

Originality: Synthesis-oriented

AI Analysis

This is an incremental improvement for healthcare professionals using NLP to review EHRs, specifically for diabetes-related notes.

The researchers tackled the problem of categorizing free-text notes in electronic health records related to diabetes, and found that a convolutional neural network achieved an AUC of 0.975, outperforming support vector machines.

Health professionals can use natural language processing (NLP) technologies when reviewing electronic health records (EHRs). Machine learning free-text classifiers can help them identify problems and make critical decisions. We aim to develop deep learning neural network algorithms that identify EHR progress notes pertaining to diabetes and to validate the algorithms at two institutions. The data comprise 2,000 EHR progress notes retrieved from patients with diabetes; all notes were manually annotated as diabetic or non-diabetic. Several deep learning classifiers were developed, and their performance was evaluated using the area under the ROC curve (AUC). The convolutional neural network (CNN) model with a separable convolution layer accurately identified diabetes-related notes in the Brigham and Women's Hospital testing set, with the highest AUC of 0.975. Deep learning classifiers can be used to identify EHR progress notes pertaining to diabetes. In particular, the CNN-based classifier can achieve a higher AUC than an SVM-based classifier.
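The abstract compares classifiers by AUC. As a minimal sketch of what that metric measures, the snippet below computes AUC as the probability that a randomly chosen diabetes-related note receives a higher classifier score than a randomly chosen unrelated note, with ties counted as one half. The function name `roc_auc` and the toy labels and scores are hypothetical illustrations. They are not the paper's data, code, or results.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    labels: iterable of 0/1 (1 = diabetes-related note, an assumed encoding)
    scores: iterable of classifier scores, higher = more likely positive
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")

    # Count positive/negative pairs ranked correctly; a tie counts as half.
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example (made-up scores for six notes, not from the paper).
toy_labels = [1, 1, 0, 1, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.1]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs ranked correctly
```

On this reading, an AUC of 0.975 means the classifier ranks a diabetes-related note above an unrelated one in about 97.5% of such pairs. The pairwise loop is quadratic in the number of notes, which is fine for a test set of this size. A rank-based formulation would be preferable for much larger sets.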
