CL | Jun 28, 2025

Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models

arXiv:2506.22813v16.73 citations | h-index: 5 | Has Code | ACL

Originality: Incremental advance

AI Analysis

This work addresses adaptation and scalability in LLM-based named entity recognition, offering an incremental improvement over existing methods.

The paper tackles the high cost of annotating fine-grained labels and training domain-specific models for named entity recognition. It proposes the SaM framework, which dynamically selects and merges pre-trained expert models at inference time and outperforms the unified model by an average of 10% across multiple benchmarks.

Supervised fine-tuning (SFT) is widely used to align large language models (LLMs) with information extraction (IE) tasks, such as named entity recognition (NER). However, annotating such fine-grained labels and training domain-specific models is costly. Existing works typically train a unified model across multiple domains, but such approaches lack adaptation and scalability since not all training data benefits target domains and scaling trained models remains challenging. We propose the SaM framework, which dynamically Selects and Merges expert models at inference time. Specifically, for a target domain, we select domain-specific experts pre-trained on existing domains based on (i) domain similarity to the target domain and (ii) performance on sampled instances, respectively. The experts are then merged to create task-specific models optimized for the target domain. By dynamically merging experts beneficial to target domains, we improve generalization across various domains without extra training. Additionally, experts can be added or removed conveniently, leading to great scalability. Extensive experiments on multiple benchmarks demonstrate our framework's effectiveness, which outperforms the unified model by an average of 10%. We further provide insights into potential improvements, practical experience, and extensions of our framework.
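The select-then-merge idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the similarity measure, the number of experts selected, or the merging rule, so the sketch assumes cosine similarity over hypothetical domain vectors, top-k selection, and uniform parameter averaging. All names (`select_and_merge`, `domain_vecs`, `sample_scores`) and all numbers are hypothetical, and each expert is reduced to a small dictionary of scalar parameters.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k(scores, k):
    """Names of the k highest-scoring experts (ties broken by name)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [name for name, _ in ranked[:k]]


def merge(experts, names):
    """ASSUMPTION: merge by uniformly averaging each parameter."""
    keys = experts[names[0]].keys()
    return {key: sum(experts[n][key] for n in names) / len(names) for key in keys}


def select_and_merge(experts, domain_vecs, target_vec, sample_scores, k=2):
    """Build one merged model per selection criterion, with no extra training.

    (i)  similarity: experts whose domain vector is closest to the target.
    (ii) performance: experts scoring best on sampled target instances.
    """
    similarity = {n: cosine(v, target_vec) for n, v in domain_vecs.items()}
    by_similarity = top_k(similarity, k)
    by_performance = top_k(sample_scores, k)
    merged = {
        "similarity": merge(experts, by_similarity),
        "performance": merge(experts, by_performance),
    }
    return merged, by_similarity, by_performance


# Toy data: three hypothetical experts with two scalar parameters each.
experts = {
    "news": {"w": 1.0, "b": 0.0},
    "bio": {"w": 3.0, "b": 2.0},
    "legal": {"w": 5.0, "b": 4.0},
}
domain_vecs = {"news": [1.0, 0.0], "bio": [0.0, 1.0], "legal": [1.0, 1.0]}
target_vec = [1.0, 0.2]
sample_scores = {"news": 0.40, "bio": 0.70, "legal": 0.65}

merged, by_sim, by_perf = select_and_merge(
    experts, domain_vecs, target_vec, sample_scores, k=2
)
print(by_sim, merged["similarity"])
print(by_perf, merged["performance"])
```

Because the expert pool is an ordinary dictionary in this sketch, adding or removing an expert is a matter of inserting or deleting an entry. That mirrors the scalability argument in the abstract, though the real framework operates on LLM weights rather than scalars.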
