CL · Feb 28

Learning Nested Named Entity Recognition from Flat Annotations

arXiv:2603.00840v10.6 · h-index: 20 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the scarcity of nested NER resources by enabling models to infer nested structure from cheaper flat annotations, which benefits both researchers and practitioners. The advance is incremental, since it builds on existing methods.

The paper tackles learning nested named entity recognition from flat annotations, which are far more abundant than costly nested ones. Its best combined method reaches 26.37% inner F1 on NEREL, a Russian benchmark, closing 40% of the gap to full nested supervision.

Nested named entity recognition identifies entities contained within other entities, but requires expensive multi-level annotation. While flat NER corpora exist abundantly, nested resources remain scarce. We investigate whether models can learn nested structure from flat annotations alone, evaluating four approaches: string inclusions (substring matching), entity corruption (pseudo-nested data), flat neutralization (reducing false negative signal), and a hybrid fine-tuned + LLM pipeline. On NEREL, a Russian benchmark with 29 entity types where 21% of entities are nested, our best combined method achieves 26.37% inner F1, closing 40% of the gap to full nested supervision. Code is available at https://github.com/fulstock/Learning-from-Flat-Annotations.
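The abstract describes string inclusions only as "substring matching", so the following is a minimal sketch of one plausible reading, not the authors' implementation. It assumes that every entity string seen in the flat annotations is collected into a lexicon, and that any lexicon string occurring inside a longer flat entity is proposed as an inner entity. The names `build_lexicon` and `inner_candidates`, the span format, and the example sentences and labels are all hypothetical.

```python
from typing import Dict, List, Tuple

# (start, end, label) with character offsets, end exclusive (assumed format).
Span = Tuple[int, int, str]


def build_lexicon(flat_examples: List[Tuple[str, List[Span]]]) -> Dict[str, str]:
    """Map each flat entity surface string to the first label it was seen with."""
    lexicon: Dict[str, str] = {}
    for text, spans in flat_examples:
        for start, end, label in spans:
            lexicon.setdefault(text[start:end], label)
    return lexicon


def inner_candidates(text: str, outer: Span, lexicon: Dict[str, str]) -> List[Span]:
    """Propose inner entities: lexicon strings found inside one outer flat entity."""
    o_start, o_end, _ = outer
    outer_text = text[o_start:o_end]
    found: List[Span] = []
    for surface, label in lexicon.items():
        if not surface or surface == outer_text:
            continue  # skip empty strings and the outer entity itself
        pos = outer_text.find(surface)
        while pos != -1:
            found.append((o_start + pos, o_start + pos + len(surface), label))
            pos = outer_text.find(surface, pos + 1)
    return sorted(found)


# Hypothetical flat-annotated sentences, for illustration only.
flat_examples = [
    ("Moscow is large.", [(0, 6, "CITY")]),
    ("Moscow State University opened.", [(0, 23, "ORGANIZATION")]),
]
lexicon = build_lexicon(flat_examples)
text, spans = flat_examples[1]
candidates = inner_candidates(text, spans[0], lexicon)
print(candidates)  # [(0, 6, 'CITY')]
```

This sketch matches raw substrings with no token-boundary or morphology handling, so on an inflected language such as Russian it would likely both miss inflected forms and produce spurious matches inside longer words.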
