CL | Jan 31, 2025

Improving Low-Resource Sequence Labeling with Knowledge Fusion and Contextual Label Explanations

Peichao Lai, Jiaxin Gan, Feiyang Ye, Yilei Wang, Bin Cui

arXiv:2501.19093v4 | 4.91 citations | h-index: 2 | EMNLP

Originality: Incremental advance

AI Analysis

This paper addresses inadequate model applicability and semantic biases in low-resource, domain-specific sequence labeling, offering an incremental improvement over existing methods.

The paper tackles low-resource sequence labeling in domain-specific contexts, particularly for Chinese, by proposing a framework that combines LLM-based knowledge enhancement with a span-based model. It reports state-of-the-art performance on multiple datasets.

Sequence labeling remains a significant challenge in low-resource, domain-specific scenarios, particularly for character-dense languages like Chinese. Existing methods primarily focus on enhancing model comprehension and improving data diversity to boost performance. However, these approaches still struggle with inadequate model applicability and semantic distribution biases in domain-specific contexts. To overcome these limitations, we propose a novel framework that combines an LLM-based knowledge enhancement workflow with a span-based Knowledge Fusion for Rich and Efficient Extraction (KnowFREE) model. Our workflow employs explanation prompts to generate precise contextual interpretations of target entities, effectively mitigating semantic biases and enriching the model's contextual understanding. The KnowFREE model further integrates extension label features, enabling efficient nested entity extraction without relying on external knowledge during inference. Experiments on multiple Chinese domain-specific sequence labeling datasets demonstrate that our approach achieves state-of-the-art performance, effectively addressing the challenges posed by low-resource settings.
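
To illustrate why a span-based formulation can extract nested entities, here is a minimal Python sketch of generic span enumeration and thresholding. It is not the KnowFREE model: `decode_spans`, `toy_scorer`, the `max_len` and `threshold` parameters, and the example labels are all hypothetical, and the lookup table stands in for a learned span classifier.

```python
from typing import Callable, Dict, List, Tuple

Span = Tuple[int, int, str]  # (start, end_exclusive, label)


def decode_spans(
    chars: str,
    score_span: Callable[[int, int], Dict[str, float]],
    max_len: int = 10,
    threshold: float = 0.5,
) -> List[Span]:
    """Enumerate every character span up to max_len and keep labels whose
    score meets the threshold. Each span is judged independently, so
    overlapping (nested) entities can all be kept."""
    spans: List[Span] = []
    n = len(chars)
    for start in range(n):
        for end in range(start + 1, min(n, start + max_len) + 1):
            for label, score in score_span(start, end).items():
                if score >= threshold:
                    spans.append((start, end, label))
    return spans


text = "北京大学"

# Hypothetical stand-in for a trained span classifier: a fixed lookup table.
TOY_SCORES = {
    "北京": {"LOC": 0.9},
    "北京大学": {"ORG": 0.95},
}


def toy_scorer(start: int, end: int) -> Dict[str, float]:
    # Return label scores for the substring, or no labels if unknown.
    return TOY_SCORES.get(text[start:end], {})


result = decode_spans(text, toy_scorer)
for start, end, label in result:
    print(text[start:end], label)
```

Because each candidate span is scored on its own rather than assigned one tag per character, the inner span "北京" and the outer span "北京大学" can both be returned, which a flat per-character tagging scheme cannot express directly.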
