CL | Oct 5, 2020

PublishInCovid19 at WNUT 2020 Shared Task-1: Entity Recognition in Wet Lab Protocols using Structured Learning Ensemble and Contextualised Embeddings

arXiv:2010.02142v2994 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses entity extraction for scientific protocols, but it is incremental as it applies existing methods to a new dataset.

The paper tackles entity recognition in wet lab protocols using an ensemble of BiLSTM-CRF models with contextualised embeddings. It achieves micro F1-scores of 0.8175 (partial match) and 0.7757 (exact match), ranking first and second on those two metrics, respectively, in the shared task.

In this paper, we describe the approach that we employed to address the task of Entity Recognition over Wet Lab Protocols, a shared task at the EMNLP WNUT-2020 Workshop. Our approach is composed of two phases. In the first phase, we experiment with various contextualised word embeddings (such as Flair and BERT-based embeddings) and a BiLSTM-CRF model to arrive at the best-performing architecture. In the second phase, we create an ensemble composed of eleven BiLSTM-CRF models. The individual models are trained on random train-validation splits of the complete dataset. Here, we also experiment with different output merging schemes, including Majority Voting and Structured Learning Ensembling (SLE). Our final submission achieved micro F1-scores of 0.8175 and 0.7757 for the partial and exact match of the entity spans, respectively. We were ranked first on partial match and second on exact match.
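The second phase described in the abstract (several models trained on random train-validation splits, with outputs merged by voting) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `random_splits` and `majority_vote`, the 10% validation fraction, the tie-breaking rule, and the example tags are all assumptions chosen for illustration. It covers only token-level Majority Voting, not Structured Learning Ensembling.

```python
import random
from collections import Counter


def random_splits(n_examples, n_models=11, val_fraction=0.1, seed=0):
    """Yield one random (train_idx, val_idx) split per ensemble member.

    val_fraction and seed are illustrative assumptions, not values from the paper.
    """
    rng = random.Random(seed)
    n_val = max(1, int(n_examples * val_fraction))
    for _ in range(n_models):
        idx = list(range(n_examples))
        rng.shuffle(idx)
        yield idx[n_val:], idx[:n_val]


def majority_vote(predictions):
    """Merge per-model tag sequences for one sentence by token-level voting.

    predictions: list of tag sequences (one per model), all the same length.
    Ties go to the tag that was seen first among the models' outputs
    (an assumed tie-breaking rule).
    """
    if len({len(p) for p in predictions}) != 1:
        raise ValueError("all models must tag the same number of tokens")
    merged = []
    for token_tags in zip(*predictions):
        # most_common(1) returns the most frequent tag for this token position
        merged.append(Counter(token_tags).most_common(1)[0][0])
    return merged


if __name__ == "__main__":
    # Hypothetical outputs of three models for a three-token sentence.
    preds = [
        ["B-Action", "O", "B-Reagent"],
        ["B-Action", "O", "O"],
        ["O", "O", "B-Reagent"],
    ]
    print(majority_vote(preds))
```

Voting independently per token can yield tag sequences that are inconsistent under the BIO scheme (for example, an `I-` tag following `O`); this sketch does not repair such cases.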
