CL, Oct 30, 2017

Creation of an Annotated Corpus of Spanish Radiology Reports

arXiv:1710.11154v1, 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses a domain-specific problem for researchers in biomedical NLP by providing a new dataset, but it is incremental as it applies existing annotation methods to new data.

The authors address the scarcity of annotated biomedical resources by creating a corpus of 513 anonymized Spanish radiology reports. The reports are manually labeled with entities, negation and uncertainty terms, and relations, and the corpus is intended as an evaluation resource for named entity recognition and relation extraction algorithms.

This paper presents a new annotated corpus of 513 anonymized radiology reports written in Spanish. Reports were manually annotated with entities, negation and uncertainty terms, and relations. The corpus was conceived as an evaluation resource for named entity recognition and relation extraction algorithms, and as training input for supervised methods. Annotated biomedical resources are scarce because of confidentiality issues and the costs of annotation. This work also provides guidelines that could help other researchers undertake similar tasks.

