Improving Biomedical Entity Linking with Retrieval-enhanced Learning
This work addresses a specific bottleneck in biomedical entity linking, the poor handling of rare and difficult entities, and offers researchers and practitioners an incremental improvement over existing methods.
The paper tackles the long-tailed distribution problem in biomedical entity linking, where rare and difficult entities are hard to link. It introduces $k$NN-BioEL, which references similar training instances as clues for prediction, and reports that it outperforms state-of-the-art baselines on several datasets.
Biomedical entity linking (BioEL) has achieved remarkable progress with the help of pre-trained language models. However, existing BioEL methods usually struggle to handle rare and difficult entities because entity mentions follow a long-tailed distribution. To address this limitation, we introduce a new scheme, $k$NN-BioEL, which gives a BioEL model the ability to reference similar instances from the entire training corpus as clues for prediction, thereby improving its generalization capability. Moreover, we design a contrastive learning objective with dynamic hard negative sampling (DHNS) that improves the quality of the retrieved neighbors during inference. Extensive experimental results show that $k$NN-BioEL outperforms state-of-the-art baselines on several datasets.
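To make the retrieval idea concrete, the following is a minimal sketch of how predictions might be combined with nearest neighbors retrieved from training instances. It is an illustration under stated assumptions, not the exact formulation of $k$NN-BioEL: the cosine similarity, the softmax temperature, the interpolation weight `lam`, the toy two-dimensional embeddings, and the entity identifiers `ENT_A` and `ENT_B` are all hypothetical choices made for this example.

```python
import math
from collections import defaultdict

# Toy datastore of (mention embedding, gold entity id) pairs.
# Embeddings and ids are made up for illustration only.
DATASTORE = [
    ([1.0, 0.0], "ENT_A"),
    ([0.8, 0.2], "ENT_A"),
    ([0.0, 1.0], "ENT_B"),
    ([0.1, 0.9], "ENT_B"),
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def knn_distribution(query, datastore, k=3, temperature=0.1):
    """Turn the k most similar training instances into a distribution
    over entities (softmax over similarity; an assumed design choice)."""
    scored = sorted(
        ((cosine(query, emb), ent) for emb, ent in datastore),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    weights = [math.exp(score / temperature) for score, _ in scored]
    total = sum(weights)
    dist = defaultdict(float)
    for weight, (_, ent) in zip(weights, scored):
        dist[ent] += weight / total  # neighbors sharing an entity pool their mass
    return dict(dist)


def interpolate(model_probs, knn_probs, lam=0.5):
    """Mix the base model's distribution with the retrieved-neighbor
    distribution; lam is an assumed hyperparameter."""
    entities = set(model_probs) | set(knn_probs)
    return {
        e: (1 - lam) * model_probs.get(e, 0.0) + lam * knn_probs.get(e, 0.0)
        for e in entities
    }


if __name__ == "__main__":
    query = [0.9, 0.1]                         # embedding of a test mention
    model_probs = {"ENT_A": 0.4, "ENT_B": 0.6}  # base model slightly prefers ENT_B
    knn_probs = knn_distribution(query, DATASTORE, k=3)
    final = interpolate(model_probs, knn_probs, lam=0.5)
    print(max(final, key=final.get))
```

In this toy setting the base model leans toward the wrong entity, and the retrieved neighbors pull the final prediction toward the entity that similar training mentions were linked to. This is the kind of behavior one would hope for on rare entities, where the parametric model alone is weakly calibrated.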