Bio-YODIE: A Named Entity Linking System for Biomedical Text
This work addresses the need for scalable, robust named entity linking in biomedical text to support tasks such as literature curation and patient record annotation. The contribution is incremental, since it builds on existing systems.
The paper tackles the challenge of linking mentions in biomedical text to knowledge bases such as the UMLS. It presents Bio-YODIE and compares it with two other popular systems to give guidance on which approaches suit which scenarios and on how systems might accommodate future needs.
Ever-expanding volumes of biomedical text require automated semantic annotation techniques if they are to be curated and put to best use. An established field of research seeks to link mentions in text to knowledge bases, such as those included in the UMLS (Unified Medical Language System), in order to enable a more sophisticated understanding of the text. This work has yielded good results for tasks such as curating literature, but annotation systems are increasingly being applied more broadly. Medical vocabularies are expanding in size, and with them the extent of term ambiguity. Document collections are growing in size and complexity, creating a greater need for speed and robustness. Furthermore, as the technologies are turned to new tasks, requirements change: for example, greater coverage of expressions may be required in order to annotate patient records, and greater accuracy may be needed for applications that affect patients. These changes place new demands on the approaches currently in use. In this work, we present a new system, Bio-YODIE, and compare it to two other popular systems in order to give guidance about which approaches are suitable in different scenarios and how systems might be designed to accommodate future needs.