CL · AI · Oct 10, 2025

A Unified Biomedical Named Entity Recognition Framework with Large Language Models

arXiv:2510.08902v1 · 2 citations · h-index: 10 · Has Code · BIBM
Originality: Incremental advance
AI Analysis

The paper addresses nested entities and cross-lingual generalization in biomedical text, offering an incremental improvement for medical information extraction.

The paper tackles biomedical named entity recognition by proposing a unified framework using large language models, achieving state-of-the-art performance and robust zero-shot generalization across languages on benchmark datasets.

Accurate recognition of biomedical named entities is critical for medical information extraction and knowledge discovery. However, existing methods often struggle with nested entities, entity boundary ambiguity, and cross-lingual generalization. In this paper, we propose a unified Biomedical Named Entity Recognition (BioNER) framework based on Large Language Models (LLMs). We first reformulate BioNER as a text generation task and design a symbolic tagging strategy to jointly handle both flat and nested entities with explicit boundary annotation. To enhance multilingual and multi-task generalization, we perform bilingual joint fine-tuning across multiple Chinese and English datasets. Additionally, we introduce a contrastive learning-based entity selector that filters incorrect or spurious predictions by leveraging boundary-sensitive positive and negative samples. Experimental results on four benchmark datasets and two unseen corpora show that our method achieves state-of-the-art performance and robust zero-shot generalization across languages. The source code is freely available at https://github.com/dreamer-tx/LLMNER.
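
To make the idea of generation-style tagging with explicit boundaries concrete, here is a minimal sketch of how nested entities could be recovered from a model's tagged output. The marker syntax (`[[TYPE|` to open, `]]` to close), the function name `parse_tagged`, and the entity type labels are all hypothetical choices for illustration; the abstract does not specify the paper's actual symbols, so this should not be read as the authors' implementation.

```python
import re
from typing import List, Tuple

# Hypothetical markers (an assumption, not the paper's symbols):
# "[[TYPE|" opens an entity of type TYPE and "]]" closes the most recent one.
TOKEN = re.compile(r"\[\[([A-Za-z_]+)\||\]\]")

Entity = Tuple[int, int, str, str]  # (start, end, type, surface text)


def parse_tagged(tagged: str) -> Tuple[str, List[Entity]]:
    """Strip the markers and return the plain text plus entity spans.

    Offsets refer to the plain (marker-free) text. A stack pairs each
    closing marker with the nearest unclosed opening marker, so flat and
    nested entities are handled by the same code path.
    """
    plain_parts: List[str] = []
    plain_len = 0
    stack: List[Tuple[str, int]] = []   # (entity type, start offset)
    spans: List[Tuple[int, int, str]] = []
    pos = 0

    for match in TOKEN.finditer(tagged):
        # Copy the ordinary text that precedes this marker.
        chunk = tagged[pos:match.start()]
        plain_parts.append(chunk)
        plain_len += len(chunk)
        pos = match.end()

        if match.group(1) is not None:
            # Opening marker: remember the type and where the entity starts.
            stack.append((match.group(1), plain_len))
        elif stack:
            # Closing marker: the entity ends at the current plain offset.
            etype, start = stack.pop()
            spans.append((start, plain_len, etype))
        # A closing marker with no open entity is silently ignored.

    plain_parts.append(tagged[pos:])
    plain = "".join(plain_parts)
    entities = sorted((s, e, t, plain[s:e]) for s, e, t in spans)
    return plain, entities


if __name__ == "__main__":
    output = "Mutations in the [[GENE|[[DISEASE|breast cancer]] 1]] gene"
    text, found = parse_tagged(output)
    print(text)
    for entity in found:
        print(entity)
```

Because the markers carry both the entity type and its boundaries, a single generated string can encode an inner entity ("breast cancer") inside an outer one ("breast cancer 1") without a separate decoding pass for each nesting level.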
