CL, AI | Jul 3, 2023

Exploring the In-context Learning Ability of Large Language Model for Biomedical Concept Linking

arXiv:2307.01137v14.917 citations | h-index: 9

Originality: Synthesis-oriented

AI Analysis

This work addresses concept linking for biomedical applications such as literature mining and information retrieval. It shows competitive performance but is incremental, as it adapts existing methods to a specific domain.

The paper tackled biomedical concept linking by exploiting in-context learning of large language models in a retrieve-and-rank framework, achieving 90.0% accuracy in disease entity normalization and 94.7% in chemical entity normalization, with over 20-point F1 improvement on an oncology dataset.

The biomedical field relies heavily on concept linking in areas such as literature mining, graph alignment, information retrieval, question answering, and data and knowledge integration. Although large language models (LLMs) have made significant strides in many natural language processing tasks, their effectiveness in biomedical concept mapping is yet to be fully explored. This research investigates a method that exploits the in-context learning (ICL) capabilities of large models for biomedical concept linking. The proposed approach adopts a two-stage retrieve-and-rank framework. First, biomedical concepts are embedded using language models, and embedding similarity is used to retrieve the top candidates. The contextual information of these candidates is then incorporated into the prompt and processed by a large language model to re-rank the concepts. This approach achieved an accuracy of 90.0% in BC5CDR disease entity normalization and 94.7% in chemical entity normalization, which is competitive with supervised learning methods. It also showed an absolute increase of over 20 points in F1 score on an oncology matching dataset. Extensive qualitative assessments were conducted, and the benefits and potential shortcomings of using large language models within the biomedical domain were discussed.
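The two-stage retrieve-and-rank idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the character n-gram `embed` function is a toy stand-in for a language-model embedding, the concept list with its `C00x` identifiers is hypothetical, the prompt wording is invented for illustration, and `llm` is assumed to be any callable that takes a prompt string and returns a string.

```python
import math
from collections import Counter

# Hypothetical toy ontology; IDs and definitions are illustrative only.
CONCEPTS = [
    {"id": "C001", "name": "Type 1 diabetes mellitus",
     "definition": "Diabetes caused by loss of insulin production."},
    {"id": "C002", "name": "Type 2 diabetes mellitus",
     "definition": "Diabetes characterized by insulin resistance."},
    {"id": "C003", "name": "Diabetes insipidus",
     "definition": "Disorder of water balance causing excessive urination."},
    {"id": "C004", "name": "Hypertension",
     "definition": "Persistently elevated arterial blood pressure."},
]

def embed(text, n=3):
    """Toy stand-in for a language-model embedding: character n-gram counts."""
    padded = f" {text.lower()} "
    return Counter(padded[i:i + n] for i in range(len(padded) - n + 1))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(v * b[k] for k, v in a.items() if k in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def retrieve(mention, concepts, k=3):
    """Stage 1: rank concepts by embedding similarity and keep the top k."""
    query = embed(mention)
    ranked = sorted(concepts,
                    key=lambda c: cosine(query, embed(c["name"])),
                    reverse=True)
    return ranked[:k]

def build_prompt(mention, sentence, candidates):
    """Put each candidate's contextual information into the prompt."""
    lines = [f'Mention: "{mention}"', f"Sentence: {sentence}", "Candidates:"]
    for i, c in enumerate(candidates, 1):
        lines.append(f"{i}. {c['name']} ({c['id']}): {c['definition']}")
    lines.append("Answer with the number of the best match, or 0 if none.")
    return "\n".join(lines)

def rerank(mention, sentence, candidates, llm):
    """Stage 2: ask the LLM to choose among the retrieved candidates."""
    reply = llm(build_prompt(mention, sentence, candidates)).strip()
    choice = int(reply) if reply.isdigit() else 0
    return candidates[choice - 1] if 1 <= choice <= len(candidates) else None

# Usage with a stub in place of a real LLM call (always picks candidate 1).
sentence = "The patient was diagnosed with type 2 diabetes."
candidates = retrieve("type 2 diabetes", CONCEPTS, k=2)
linked = rerank("type 2 diabetes", sentence, candidates, llm=lambda p: "1")
```

In a real system the stub would be replaced by a call to an actual model, and the retrieval stage would use dense embeddings over the full ontology rather than n-gram counts over four entries.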
