CL · Apr 7, 2017

EELECTION at SemEval-2017 Task 10: Ensemble of nEural Learners for kEyphrase ClassificaTION

Steffen Eger, Erik-Lân Do Dinh, Ilia Kuznetsov, Masoud Kiaeeha, Iryna Gurevych

arXiv:1704.02215v26.021 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses a specific NLP task in scientific document analysis. Its contribution is incremental: it builds on existing ensemble and deep learning methods for keyphrase classification.

The paper tackles the classification of keyphrases from scientific publications by exploring three deep learning approaches and combining them in an ensemble. The ensemble achieved a micro-F1 score of 0.63 on the test data, ranking 2nd out of four systems, and reaches 0.69 when trained on the full data (training plus development).

This paper describes our approach to the SemEval 2017 Task 10: "Extracting Keyphrases and Relations from Scientific Publications", specifically to Subtask (B): "Classification of identified keyphrases". We explored three different deep learning approaches: a character-level convolutional neural network (CNN), a stacked learner with an MLP meta-classifier, and an attention-based Bi-LSTM. From these approaches, we created an ensemble of differently hyper-parameterized systems, achieving a micro-F1-score of 0.63 on the test data. Our approach ranks 2nd (score of 1st placed system: 0.64) out of four according to this official score. However, we erroneously trained 2 out of 3 neural nets (the stacker and the CNN) on only roughly 15% of the full data, namely, the original development set. When trained on the full data (training+development), our ensemble has a micro-F1-score of 0.69. Our code is available from https://github.com/UKPLab/semeval2017-scienceie.
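To make the two technical notions in the abstract concrete, the following is a minimal sketch of combining several classifiers' label predictions and scoring the result with micro-averaged F1. It is not taken from the authors' repository. It assumes simple plurality voting as the combination rule, which the abstract does not specify, and the function names (`majority_vote`, `micro_f1`), the label strings, and the toy predictions are all hypothetical.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-model label lists into one list by plurality vote.

    `predictions` holds one list of labels per model, all of equal length.
    On a tie, the label that appears first among the tied votes wins
    (an arbitrary choice made for this sketch).
    """
    combined = []
    for votes in zip(*predictions):
        combined.append(Counter(votes).most_common(1)[0][0])
    return combined


def micro_f1(gold, predicted):
    """Micro-averaged F1: pool TP/FP/FN over all classes, then compute F1."""
    labels = set(gold) | set(predicted)
    tp = fp = fn = 0
    for label in labels:
        for g, p in zip(gold, predicted):
            if p == label and g == label:
                tp += 1
            elif p == label:
                fp += 1
            elif g == label:
                fn += 1
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data, invented for illustration only (not results from the paper).
gold = ["Process", "Task", "Material", "Process", "Material"]
cnn = ["Process", "Task", "Process", "Process", "Material"]
stacker = ["Process", "Material", "Material", "Task", "Material"]
bilstm = ["Task", "Task", "Material", "Process", "Process"]

ensemble = majority_vote([cnn, stacker, bilstm])
print(micro_f1(gold, cnn), micro_f1(gold, ensemble))
```

In this toy setup each individual model makes at least one error, while the vote recovers every gold label, which illustrates why an ensemble of differently configured systems can outscore its members. When every keyphrase receives exactly one predicted label, as here, micro-F1 coincides with plain accuracy.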
