AI CL Sep 18, 2018

Lung Cancer Concept Annotation from Spanish Clinical Narratives

Marjan Najafabadipour, Juan Manuel Tuñas, Alejandro Rodríguez-González, Ernestina Menasalvas

arXiv:1809.06639v1, 12 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the difficulty healthcare professionals face in extracting useful information from unstructured clinical notes written in Spanish. Its contribution is incremental, since it applies existing methods to a new language domain.

The paper tackles the extraction of lung cancer concepts from Spanish clinical narratives so that query and answering can be performed accurately. It describes the design of annotators that can be integrated into the Apache UIMA framework and details how annotation outcomes are generated and stored.

The recent rapid increase in the generation of clinical data, together with the rapid development of computational science, enables us to extract new insights from massive datasets in the healthcare industry. Oncological clinical notes are creating rich databases that document patients' histories, and they potentially contain many patterns that could help in better management of the disease. However, these patterns are locked within the free-text (unstructured) portions of clinical documents, which limits health professionals' ability to extract useful information from them and, ultimately, to perform the Query and Answering (QA) process accurately. The Information Extraction (IE) process requires Natural Language Processing (NLP) techniques to assign semantics to these patterns. Therefore, in this paper, we analyze the design of annotators for specific lung cancer concepts that can be integrated into the Apache Unstructured Information Management Architecture (UIMA) framework. In addition, we explain the details of the generation and storage of annotation outcomes.
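
To make the idea of a concept annotator concrete, here is a minimal Python sketch, not the authors' implementation. It assumes a rule-based approach in which each concept is recognized by a regular expression and recorded as a standoff annotation with begin and end character offsets, which is how UIMA annotations reference the text they cover. The concept names (`CancerEntity`, `Stage`), the Spanish patterns, the `Annotation` record, and the JSON storage format are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import json
import re
from dataclasses import dataclass, asdict


@dataclass
class Annotation:
    """Standoff annotation: character offsets plus a concept label (hypothetical schema)."""
    begin: int
    end: int
    concept: str
    covered_text: str


# Hypothetical patterns for two lung cancer concepts in Spanish text.
# A real annotator would need far broader vocabulary and context handling.
PATTERNS = {
    "CancerEntity": re.compile(
        r"\b(adenocarcinoma|carcinoma epidermoide|carcinoma microcítico)\b",
        re.IGNORECASE,
    ),
    "Stage": re.compile(
        r"\bestadio\s+(IV|III[AB]?|II[AB]?|I[AB]?)\b",
        re.IGNORECASE,
    ),
}


def annotate(text):
    """Return all pattern matches in `text` as annotations sorted by start offset."""
    annotations = []
    for concept, pattern in PATTERNS.items():
        for match in pattern.finditer(text):
            annotations.append(
                Annotation(match.start(), match.end(), concept, match.group(0))
            )
    return sorted(annotations, key=lambda a: a.begin)


def to_json(annotations):
    """Serialize annotations to a JSON string for storage (assumed format)."""
    return json.dumps([asdict(a) for a in annotations], ensure_ascii=False)


if __name__ == "__main__":
    note = "Paciente con adenocarcinoma de pulmón estadio IV."
    print(to_json(annotate(note)))
```

Keeping the annotations separate from the note text, rather than rewriting the note, leaves the original document untouched and lets several annotators add their own spans independently.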
