Nested Named Entity Recognition via Second-best Sequence Learning and Decoding
This paper addresses the challenge of identifying nested entities in text, which matters for natural language processing applications such as information extraction, but it is an incremental improvement over existing methods for handling nested structures.
The paper tackles the problem of nested named entity recognition, where entity names contain other names, by proposing a method that treats nested entity tag sequences as the second-best path within the span of their parent entity and decodes from the outermost entities inward. It achieves F1-scores of 85.82%, 84.34%, and 77.36% on the ACE-2004, ACE-2005, and GENIA datasets, respectively, outperforming or matching existing methods.
When an entity name contains other names within it, identifying all combinations of names can become difficult and expensive. We propose a new method to recognize not only outermost named entities but also inner nested ones. We design an objective function for training a neural model that treats the tag sequence for nested entities as the second-best path within the span of their parent entity. In addition, we provide a decoding method for inference that extracts entities iteratively, from the outermost ones to the inner ones. Our method adds no hyperparameters beyond those of the conditional random field based model widely used for flat named entity recognition tasks. Experiments demonstrate that our method performs better than or at least as well as existing methods capable of handling nested entities, achieving F1-scores of 85.82%, 84.34%, and 77.36% on the ACE-2004, ACE-2005, and GENIA datasets, respectively.
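To make the notion of a "second-best path" concrete, the following is a minimal sketch of 2-best Viterbi decoding over a linear-chain score table, written in plain Python. It is an illustration under stated assumptions, not the paper's implementation: the function name two_best_paths, the emissions/transitions list layout, and the toy scores are all hypothetical, and the sketch covers neither the training objective nor the restriction of the search to a parent entity's span.

```python
def two_best_paths(emissions, transitions):
    """Return up to two highest-scoring (score, tag_path) pairs, best first.

    Hypothetical layout, assumed for this sketch only:
      emissions[t][y]   : score of tag y at position t
      transitions[a][b] : score of moving from tag a to tag b
    """
    n_tags = len(emissions[0])
    # beams[y] holds up to two (score, path) hypotheses that end in tag y.
    beams = [[(emissions[0][y], (y,))] for y in range(n_tags)]

    for t in range(1, len(emissions)):
        new_beams = []
        for y in range(n_tags):
            # Extend every kept hypothesis by tag y, then keep the top two.
            candidates = [
                (score + transitions[prev][y] + emissions[t][y], path + (y,))
                for prev in range(n_tags)
                for score, path in beams[prev]
            ]
            candidates.sort(key=lambda c: c[0], reverse=True)
            new_beams.append(candidates[:2])
        beams = new_beams

    # Merge the per-tag beams and keep the two best complete paths.
    finals = [hyp for beam in beams for hyp in beam]
    finals.sort(key=lambda c: c[0], reverse=True)
    return finals[:2]


# Toy scores for a three-token sentence with three tags (illustrative only).
emissions = [[2, 0, 1], [0, 3, 1], [1, 0, 4]]
transitions = [[0, 2, -1], [-2, 0, 1], [1, -1, 0]]
best, second_best = two_best_paths(emissions, transitions)
```

Keeping two hypotheses per tag at every position is enough to recover the two best complete paths exactly, because any prefix of a top-two path must itself be among the top two paths ending in its last tag. In the setting the abstract describes, a search of this kind would presumably be run inside the span of each entity already extracted, and repeated on whatever inner entities it yields, giving the outermost-to-inner order of decoding.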