Unifying Token and Span Level Supervisions for Few-Shot Sequence Labeling
This work addresses data scarcity in sequence labeling for NLP tasks, but the contribution is incremental: it builds on existing metric learning approaches by combining token-level and span-level granularities.
The paper tackles few-shot sequence labeling by unifying token-level and span-level supervision, proposing a Consistent Dual Adaptive Prototypical (CDAP) network with joint training and a consistent loss. The authors report new state-of-the-art performance on three benchmark datasets.
Few-shot sequence labeling aims to identify novel classes from only a few labeled samples. Existing methods address the data scarcity problem mainly by designing token-level or span-level labeling models based on metric learning. However, these methods are trained at a single granularity (i.e., either token level or span level) and therefore inherit the weaknesses of that granularity. In this paper, we first unify token-level and span-level supervision and propose a Consistent Dual Adaptive Prototypical (CDAP) network for few-shot sequence labeling. CDAP contains a token-level network and a span-level network, jointly trained at different granularities. To align the outputs of the two networks, we further propose a consistent loss that enables them to learn from each other. During the inference phase, we propose a consistent greedy inference algorithm that first adjusts the predicted probabilities and then greedily selects non-overlapping spans with maximum probability. Extensive experiments show that our model achieves new state-of-the-art results on three benchmark datasets.
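The greedy selection step of the inference algorithm can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say how the predicted probabilities are adjusted, so the sketch assumes each candidate already carries its adjusted probability. The `Span` tuple layout, the `greedy_select` name, and the example candidates are hypothetical.

```python
from typing import List, Tuple

# Hypothetical candidate format: (start, end, label, probability).
# Token indices are inclusive; the probability is assumed to be
# already adjusted by whatever consistency step precedes selection.
Span = Tuple[int, int, str, float]


def greedy_select(candidates: List[Span]) -> List[Span]:
    """Keep the most probable spans that do not overlap each other."""
    selected: List[Span] = []
    # Visit candidates from highest to lowest probability.
    for span in sorted(candidates, key=lambda s: s[3], reverse=True):
        start, end = span[0], span[1]
        # Two inclusive ranges are disjoint when one ends before the
        # other starts; accept the span only if that holds for all
        # spans chosen so far.
        if all(end < s[0] or start > s[1] for s in selected):
            selected.append(span)
    # Return the chosen spans in reading order.
    return sorted(selected, key=lambda s: s[0])


candidates = [
    (0, 1, "PER", 0.90),
    (1, 2, "ORG", 0.60),  # overlaps the PER span at token 1
    (3, 4, "LOC", 0.80),  # overlaps the shorter LOC span at token 4
    (4, 4, "LOC", 0.85),
]
result = greedy_select(candidates)
print(result)
```

In this example the two lower-probability spans are dropped because each shares a token with a span that was selected earlier, leaving the PER span over tokens 0 to 1 and the LOC span at token 4.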