CL | Nov 30, 2024

Few-Shot Domain Adaptation for Named-Entity Recognition via Joint Constrained k-Means and Subspace Selection

Ayoub Hammal, Benno Uthayasooriyar, Caio Corro

arXiv:2412.00426v211.519 citations | h-index: 1 | Has Code | COLING

Originality: Incremental advance

AI Analysis

This paper addresses the problem of limited annotated data for NER across domains, though it appears incremental because it builds on existing k-means and subspace selection methods.

The paper tackles few-shot named-entity recognition by proposing a weakly supervised algorithm that combines small labeled datasets with unlabeled data, achieving state-of-the-art results on several English datasets.

Named-entity recognition (NER) is a task that typically requires large annotated datasets, which limits its applicability across domains with varying entity definitions. This paper addresses few-shot NER, aiming to transfer knowledge to new domains with minimal supervision. Unlike previous approaches that rely solely on limited annotated data, we propose a weakly supervised algorithm that combines small labeled datasets with large amounts of unlabeled data. Our method extends the k-means algorithm with label supervision, cluster size constraints and domain-specific discriminative subspace selection. This unified framework achieves state-of-the-art results in few-shot NER on several English datasets.
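
To make the idea of k-means with label supervision concrete, here is a minimal sketch in Python with NumPy. It is not the authors' implementation: it covers only the label-supervision part, and it leaves out the cluster size constraints and the subspace selection step that the abstract mentions. The function name `label_supervised_kmeans`, its parameters, and the convention of marking unlabeled points with `-1` are assumptions made for illustration.

```python
import numpy as np


def label_supervised_kmeans(X, labels, n_clusters, n_iter=20):
    """Illustrative k-means in which labeled points are pinned to their class.

    X: (n, d) array of feature vectors.
    labels: (n,) integer array; class index for labeled points, -1 otherwise.
    Assumes every cluster has at least one labeled example.
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    labeled = labels >= 0

    # Initialize each centroid as the mean of that class's labeled examples.
    centroids = np.stack([X[labels == k].mean(axis=0) for k in range(n_clusters)])
    assign = labels.copy()

    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dists = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_assign = dists.argmin(axis=1)
        # Label supervision: labeled points never leave their class cluster.
        new_assign[labeled] = labels[labeled]
        if np.array_equal(new_assign, assign):
            break
        assign = new_assign
        # Recompute centroids from labeled and unlabeled members together.
        centroids = np.stack([X[assign == k].mean(axis=0) for k in range(n_clusters)])

    return assign, centroids


# Toy usage: two well-separated groups, one labeled example in each.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [5.0, 5.0], [5.1, 5.0], [5.0, 5.1]])
labels = np.array([0, -1, -1, 1, -1, -1])
assign, centroids = label_supervised_kmeans(X, labels, n_clusters=2)
```

In this sketch the few labeled examples both seed the centroids and stay fixed in their clusters, while the unlabeled points move freely and refine the centroids. A cluster size constraint would replace the per-point `argmin` with a joint assignment over all points, which is outside the scope of this example.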
