CL IR Aug 16, 2021

Hybrid deep learning methods for phenotype prediction from clinical notes

Sahar Khalafi, Nasser Ghadiri, Milad Moradi

arXiv:2108.10682v30.23 citations

Originality Synthesis-oriented

AI Analysis

This work addresses the challenge of phenotype prediction for clinical information management, but it is incremental as it builds on existing NLP and deep learning methods.

The paper tackles the problem of automatically extracting patient phenotypes from clinical notes by proposing a hybrid deep learning model that combines a neural bidirectional sequence model with CNN layers. It reports significant performance improvements over existing models, and an enhanced version with an extra CNN layer achieves a relatively higher F1-score than the original hybrid model.

Identifying patient cohorts from clinical notes in secondary electronic health records is a fundamental task in clinical information management. However, as the number of clinical notes grows, it becomes challenging to analyze the data manually for phenotype detection. Automatic extraction of clinical concepts helps to identify patient phenotypes correctly. This paper proposes a novel hybrid model that uses natural language processing and deep learning to extract patient phenotypes automatically, without dictionaries or human intervention. The model is based on a neural bidirectional sequence model (BiLSTM or BiGRU) and a CNN layer for phenotype identification. An extra CNN layer runs in parallel to the hybrid model to extract more features related to each phenotype. We used pre-trained embeddings such as FastText and Word2vec separately as the input layer to compare the performance of each embedding. Experimental results on the MIMIC-III database show, in an internal comparison, that the proposed model achieved a significant performance improvement over existing models. The enhanced version of our model with an extra CNN layer obtained a relatively higher F1-score than the original hybrid model. We also showed that a BiGRU layer with FastText embeddings performed better than a BiLSTM layer at identifying patient phenotypes.
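The architecture described in the abstract can be pictured with a minimal PyTorch sketch. This is not the authors' implementation. The class name `HybridPhenotypeClassifier`, all hyperparameters (vocabulary size, embedding and hidden dimensions, filter count, kernel size, number of phenotypes), the max-over-time pooling, and the choice to feed the parallel CNN branch directly from the embeddings are assumptions made for illustration; the abstract does not specify them. The embedding layer is randomly initialised here, whereas the paper uses pre-trained FastText or Word2vec vectors.

```python
import torch
import torch.nn as nn


class HybridPhenotypeClassifier(nn.Module):
    """Illustrative BiGRU/BiLSTM + CNN model with an optional parallel CNN branch.

    Hypothetical sketch: layer sizes and wiring details are assumptions.
    """

    def __init__(self, vocab_size=5000, embed_dim=100, hidden_dim=64,
                 num_filters=32, kernel_size=3, num_phenotypes=10,
                 rnn_type="gru", extra_cnn=True):
        super().__init__()
        # Stand-in for pre-trained FastText/Word2vec embeddings (random init here).
        self.embedding = nn.Embedding(vocab_size, embed_dim, padding_idx=0)
        # Bidirectional sequence model: BiGRU or BiLSTM.
        rnn_cls = nn.GRU if rnn_type == "gru" else nn.LSTM
        self.rnn = rnn_cls(embed_dim, hidden_dim,
                           batch_first=True, bidirectional=True)
        # CNN layer stacked on top of the bidirectional sequence output.
        self.conv_seq = nn.Conv1d(2 * hidden_dim, num_filters,
                                  kernel_size, padding=kernel_size // 2)
        # Assumed wiring: the extra CNN reads the embeddings in parallel.
        self.conv_par = (
            nn.Conv1d(embed_dim, num_filters, kernel_size,
                      padding=kernel_size // 2)
            if extra_cnn else None
        )
        out_dim = num_filters * (2 if extra_cnn else 1)
        self.classifier = nn.Linear(out_dim, num_phenotypes)

    def forward(self, token_ids):
        emb = self.embedding(token_ids)        # (batch, seq, embed_dim)
        seq_out, _ = self.rnn(emb)             # (batch, seq, 2 * hidden_dim)
        # Conv1d expects (batch, channels, seq), hence the transpose.
        feats = torch.relu(self.conv_seq(seq_out.transpose(1, 2)))
        pooled = [feats.max(dim=2).values]     # max-over-time pooling
        if self.conv_par is not None:
            par = torch.relu(self.conv_par(emb.transpose(1, 2)))
            pooled.append(par.max(dim=2).values)
        # One logit per phenotype.
        return self.classifier(torch.cat(pooled, dim=1))
```

Calling `HybridPhenotypeClassifier(rnn_type="lstm", extra_cnn=False)` gives the plain BiLSTM + CNN variant, while the defaults give the BiGRU variant with the parallel CNN branch. Both map a batch of token-id sequences to one logit per phenotype.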
