Confidence penalty, annealing Gaussian noise and zoneout for biLSTM-CRF networks for named entity recognition
This work improves named entity recognition accuracy for Spanish text processing, but it is incremental as it builds on an existing state-of-the-art architecture.
The paper optimizes biLSTM-CRF networks for named entity recognition by analyzing methods that help avoid overfitting, such as parameter space exploration and regularization, and reaches a new state-of-the-art F1 score of 87.18 on the CoNLL-2003 Spanish dataset.
Named entity recognition (NER) identifies relevant entities in text. A bidirectional long short-term memory (LSTM) encoder with a neural conditional random field (CRF) decoder (biLSTM-CRF) is the state-of-the-art methodology. In this work, we analyze several methods intended to optimize the performance of networks based on this architecture, some of which also help avoid overfitting. These methods target exploration of the parameter space, regularization of LSTMs, and penalization of confident output distributions. Results show that the optimization methods improve the performance of the biLSTM-CRF NER baseline system, setting a new state of the art for the CoNLL-2003 Spanish set with an F1 of 87.18.
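The three method families named in the title can be illustrated with a minimal, framework-free sketch. It is not the paper's implementation: the function names, the default values of `beta`, `eta`, `gamma` and `rate`, and the choice to add the noise to gradients are all assumptions made for illustration, and the formulas are the commonly used formulations of each technique (entropy-based confidence penalty, Gaussian noise whose variance decays with the training step, and zoneout that randomly keeps previous hidden units).

```python
import math
import random


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def confidence_penalized_loss(probs, gold_index, beta=0.1):
    """Negative log-likelihood minus beta times the output entropy.

    Peaked (over-confident) distributions have low entropy, so they get
    a smaller reduction of the loss than flatter ones. `beta` is an
    assumed, illustrative weight.
    """
    nll = -math.log(probs[gold_index])
    return nll - beta * entropy(probs)


def annealed_noise_std(step, eta=0.3, gamma=0.55):
    """Noise standard deviation at a training step.

    Assumed schedule: variance = eta / (1 + step) ** gamma, so the noise
    shrinks as training proceeds.
    """
    return math.sqrt(eta / (1.0 + step) ** gamma)


def add_annealed_noise(gradients, step, rng, eta=0.3, gamma=0.55):
    """Add zero-mean Gaussian noise with the annealed std to each gradient."""
    std = annealed_noise_std(step, eta, gamma)
    return [g + rng.gauss(0.0, std) for g in gradients]


def zoneout(h_prev, h_new, rate, rng, training=True):
    """Zoneout over a hidden-state vector.

    Training: each unit keeps its previous value with probability `rate`,
    otherwise it takes the newly computed value.
    Inference: use the expected value of that random choice.
    """
    if training:
        return [p if rng.random() < rate else n for p, n in zip(h_prev, h_new)]
    return [rate * p + (1.0 - rate) * n for p, n in zip(h_prev, h_new)]
```

A short usage example, again with made-up values: `confidence_penalized_loss([0.98, 0.01, 0.01], 0)` scores a peaked prediction, `add_annealed_noise(grads, step, random.Random(0))` perturbs a list of gradients at a given step, and `zoneout(h_prev, h_new, 0.1, random.Random(0))` mixes two hidden-state vectors. In a real biLSTM-CRF system these operations would be applied to tensors inside the training loop of a deep learning framework.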