Meta-Learning Triplet Network with Adaptive Margins for Few-Shot Named Entity Recognition
This work addresses a specific bottleneck in few-shot NER, the difficulty of representing the Other (O) class with a single prototype, through incremental improvements in method design.
The paper tackles the challenge of representing the miscellaneous 'Other' class in few-shot named entity recognition. It proposes MeTNet, which avoids creating a prototype for this class and uses an improved triplet network with an adaptive margin for each entity type, and it reports better results than other state-of-the-art methods in both in-domain and cross-domain experiments.
Meta-learning methods, especially prototype-based methods, have been widely used in few-shot named entity recognition (NER). However, the Other (O) class is difficult to represent with a prototype vector because it generally contains a large number of samples with miscellaneous semantics. To solve this problem, we propose MeTNet, which generates prototype vectors only for entity types, not for the O class. We design an improved triplet network that maps samples and prototype vectors into a low-dimensional space in which they are easier to classify, and we propose an adaptive margin for each entity type. The margin acts as a radius and defines a region of adaptive size in the low-dimensional space. Based on these regions, we propose a new inference procedure to predict the label of a query instance. We conduct extensive experiments in both in-domain and cross-domain settings to show the superiority of MeTNet over other state-of-the-art methods. In addition, we release FEW-COMM, a Chinese few-shot NER dataset extracted from a well-known e-commerce platform. To the best of our knowledge, this is the first Chinese few-shot NER dataset. All datasets and code are available at https://github.com/hccngu/MeTNet.
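
The region-based inference described in the abstract can be illustrated with a minimal sketch. The abstract does not spell out the decision rule, so the following is an assumption rather than the authors' implementation: a query is assigned an entity type when its Euclidean distance to that type's prototype is within the type's margin, the nearest prototype wins when several regions overlap, and a query outside every region is labelled O. The function name `predict_label`, the 2-D embeddings, and the margin values are all hypothetical.

```python
import math

def predict_label(query, prototypes, margins, other_label="O"):
    """Label one embedded query token using margin-bounded regions.

    Assumed rule (not taken from the paper): the query belongs to an entity
    type if it lies within that type's margin of the type's prototype; among
    several matching types the closest prototype wins; otherwise return O.
    """
    best_label, best_dist = other_label, math.inf
    for label, proto in prototypes.items():
        dist = math.dist(query, proto)  # Euclidean distance to the prototype
        if dist <= margins[label] and dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

# Hypothetical prototypes and per-type margins in a 2-D embedding space.
prototypes = {"PER": (0.0, 0.0), "LOC": (4.0, 0.0)}
margins = {"PER": 1.0, "LOC": 2.0}

print(predict_label((0.5, 0.0), prototypes, margins))  # inside the PER region
print(predict_label((2.5, 0.0), prototypes, margins))  # inside the LOC region
print(predict_label((2.0, 5.0), prototypes, margins))  # outside both regions
```

Under this assumed rule no prototype is needed for the O class: O is simply the label for everything that falls outside all entity regions, which matches the motivation given above.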