CL, May 1, 2020

Improving Broad-Coverage Medical Entity Linking with Semantic Type Prediction and Large-Scale Datasets

arXiv:2005.00460v4, 31 citations
AI Analysis

This work addresses candidate overgeneration, a bottleneck in medical entity linking for healthcare and NLP applications. The contribution is incremental: it plugs into the existing three-step pipeline rather than replacing it.

The paper tackles the overgeneration of candidate concepts in medical entity linking by introducing MedType, a system that prunes irrelevant candidates using semantic type prediction. MedType improved performance across five toolkits and several benchmark datasets, and pre-training it on two new large-scale datasets, WikiMed and PubMedDS, boosted results further.

Medical entity linking is the task of identifying and standardizing medical concepts referred to in an unstructured text. Most of the existing methods adopt a three-step approach of (1) detecting mentions, (2) generating a list of candidate concepts, and finally (3) picking the best concept among them. In this paper, we probe into alleviating the problem of overgeneration of candidate concepts in the candidate generation module, the most under-studied component of medical entity linking. For this, we present MedType, a fully modular system that prunes out irrelevant candidate concepts based on the predicted semantic type of an entity mention. We incorporate MedType into five off-the-shelf toolkits for medical entity linking and demonstrate that it consistently improves entity linking performance across several benchmark datasets. To address the dearth of annotated training data for medical entity linking, we present WikiMed and PubMedDS, two large-scale medical entity linking datasets, and demonstrate that pre-training MedType on these datasets further improves entity linking performance. We make our source code and datasets publicly available for medical entity linking research.
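The pruning idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Candidate` class, the `prune_by_type` function, the concept identifiers, the semantic type labels, and the scores are all hypothetical placeholders, and the fallback to the unpruned list when nothing survives is an assumption made here for robustness. The sketch only shows how a predicted semantic type could filter an overgenerated candidate list before the final disambiguation step.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """One candidate concept proposed by a candidate generator (hypothetical)."""
    concept_id: str      # placeholder identifier, not a real ontology ID
    semantic_type: str   # coarse semantic type of the concept
    score: float         # score assigned by the underlying linker


def prune_by_type(candidates, predicted_types):
    """Keep only candidates whose semantic type was predicted for the mention.

    Assumption: if pruning removes every candidate, fall back to the
    original list so the downstream linker still has something to rank.
    """
    kept = [c for c in candidates if c.semantic_type in predicted_types]
    return kept if kept else list(candidates)


# Toy candidates for the mention "cold" (all values are illustrative).
candidates = [
    Candidate("C-cold-illness", "Disorder", 0.6),
    Candidate("C-cold-temperature", "Phenomenon", 0.7),
    Candidate("C-cold-sensation", "Finding", 0.5),
]

# Suppose a type predictor decided the mention refers to a disorder.
pruned = prune_by_type(candidates, {"Disorder"})

# Step (3) of the pipeline: pick the best remaining candidate.
best = max(pruned, key=lambda c: c.score)
```

Without pruning, the highest-scoring candidate in this toy list is the temperature sense. After pruning on the predicted type, only the disorder sense remains, which is the kind of correction the abstract attributes to type-based filtering.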

Code Implementations: 1 repo