CL · Oct 24, 2018

Clinical Concept Extraction with Contextual Word Embedding

Henghui Zhu, Ioannis Ch. Paschalidis, Amir Tahmasebi

arXiv:1810.10566v24.778 citations · Has Code

Originality: Incremental advance

AI Analysis

This work addresses automatic structuring of clinical data for healthcare professionals. It is an incremental advance whose contribution is a measured F1-score gain on an established benchmark.

The authors tackled clinical concept extraction from unstructured clinical notes by proposing a model that uses domain-specific contextual word embeddings and a bidirectional LSTM-CRF, achieving a 3.4% improvement in F1-score over state-of-the-art models on the I2B2 2010 dataset.

Automatic extraction of clinical concepts is an essential step for turning the unstructured data within a clinical note into structured and actionable information. In this work, we propose a clinical concept extraction model for automatic annotation of clinical problems, treatments, and tests in clinical notes utilizing domain-specific contextual word embedding. A contextual word embedding model is first trained on a corpus with a mixture of clinical reports and relevant Wikipedia pages in the clinical domain. Next, a bidirectional LSTM-CRF model is trained for clinical concept extraction using the contextual word embedding model. We tested our proposed model on the I2B2 2010 challenge dataset. Our proposed model achieved the best performance among reported baseline models and outperformed the state-of-the-art models by 3.4% in terms of F1-score.
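To make the tagging step concrete, here is a minimal sketch of how a CRF layer on top of per-token scores could select a consistent tag sequence and how those tags map to problem, treatment, and test spans. It is not the authors' implementation. The BIO tag scheme, the hand-written emission scores (standing in for BiLSTM outputs), the hard transition penalties (a trained CRF learns these scores), and the names `viterbi`, `bio_to_spans`, `allowed`, and `emit` are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Assumed BIO tag set for the three concept types named in the abstract.
TAGS = ["O", "B-problem", "I-problem", "B-treatment", "I-treatment",
        "B-test", "I-test"]


def viterbi(emissions: List[List[float]],
            transitions: List[List[float]]) -> List[int]:
    """Return the highest-scoring tag index sequence.

    emissions[t][j]: score of tag j at token t (e.g. from a BiLSTM).
    transitions[i][j]: score of moving from tag i to tag j.
    """
    if not emissions:
        return []
    n_tags = len(emissions[0])
    score = list(emissions[0])
    backptr: List[List[int]] = []
    for t in range(1, len(emissions)):
        new_score, ptr = [], []
        for j in range(n_tags):
            # Best previous tag for landing on tag j at position t.
            best_i = max(range(n_tags),
                         key=lambda i: score[i] + transitions[i][j])
            new_score.append(score[best_i] + transitions[best_i][j]
                             + emissions[t][j])
            ptr.append(best_i)
        score = new_score
        backptr.append(ptr)
    # Follow back-pointers from the best final tag.
    path = [max(range(n_tags), key=lambda j: score[j])]
    for ptr in reversed(backptr):
        path.append(ptr[path[-1]])
    path.reverse()
    return path


def bio_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Convert BIO tags to (start, end_exclusive, concept_type) spans."""
    spans = []
    start, kind = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        continues = tag.startswith("I-") and kind == tag[2:]
        if start is not None and not continues:
            spans.append((start, i, kind))
            start, kind = None, None
        # A stray I- tag with no open span is treated leniently as a start.
        if tag.startswith("B-") or (tag.startswith("I-") and start is None):
            start, kind = i, tag[2:]
    return spans


def allowed(prev: str, cur: str) -> bool:
    """An I-x tag may only follow B-x or I-x; everything else is allowed."""
    if not cur.startswith("I-"):
        return True
    return prev in ("B-" + cur[2:], "I-" + cur[2:])


def emit(scores: Dict[str, float]) -> List[float]:
    """Build one emission row; unlisted tags score 0.0."""
    return [scores.get(tag, 0.0) for tag in TAGS]


# Illustrative transition scores: a large penalty for invalid BIO moves.
transitions = [[0.0 if allowed(p, c) else -1e4 for c in TAGS] for p in TAGS]

tokens = ["chest", "pain", "treated", "with", "aspirin"]
# Made-up emission scores. Taken alone, "pain" prefers I-treatment.
emissions = [
    emit({"B-problem": 2.0}),
    emit({"I-problem": 1.0, "I-treatment": 1.2}),
    emit({"O": 2.0}),
    emit({"O": 2.0}),
    emit({"B-treatment": 2.0}),
]

decoded = [TAGS[i] for i in viterbi(emissions, transitions)]
print(decoded)
# ['B-problem', 'I-problem', 'O', 'O', 'B-treatment']
print(bio_to_spans(decoded))
# [(0, 2, 'problem'), (4, 5, 'treatment')]
```

In this toy input, the per-token scores alone would tag "pain" as I-treatment, but the transition penalty makes the sequence-level decoder choose I-problem so that the span stays consistent with "chest". That sequence-level consistency is the usual motivation for placing a CRF on top of a BiLSTM tagger.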

