CL LG Sep 26, 2019

Coin_flipper at eHealth-KD Challenge 2019: Voting LSTMs for Key Phrases and Semantic Relation Identification Applied to Spanish eHealth Texts

arXiv:1909.12339v10.2

Originality Synthesis-oriented

AI Analysis

This is an incremental improvement for processing Spanish eHealth texts, addressing a domain-specific problem in healthcare NLP.

The paper tackled the eHealth-KD 2019 challenge tasks of key phrase and semantic relation identification in Spanish eHealth texts, achieving second place with an F1 score of 62.18% using a stacked bi-LSTM with a surrogate F1 loss function and ensemble voting.

This paper describes our approach to the eHealth-KD 2019 challenge. Our participation aimed to test how far we could go using generic text-processing tools combined with common optimization techniques from the field of data mining. The architecture proposed for both tasks of the challenge is a standard stacked 2-layer bi-LSTM. The main particularities of our approach are: (a) the use of a surrogate function of F1 as the loss function, to close the gap between the function being minimized and the evaluation metric, and (b) the generation of an ensemble of models whose predictions are combined by majority vote. Our system ranked second in the main task with an F1 score of 62.18%, narrowly behind the winner, which scored 63.94%.
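The two particularities named in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact form of the surrogate F1 or the voting rule, so everything here is an assumption: the code uses a common "soft F1" formulation (true and false positives computed from predicted probabilities instead of hard decisions) and a plain per-token majority vote. The function names `soft_f1_loss` and `majority_vote` and the `B`/`I`/`O` labels in the usage lines are hypothetical and chosen only for illustration.

```python
from collections import Counter


def soft_f1_loss(probs, labels, eps=1e-8):
    """Return 1 - soft F1 for a binary task (assumed formulation).

    probs:  predicted probabilities in [0, 1], one per example
    labels: gold labels, each 0 or 1
    Counts are computed from probabilities, so the value varies
    smoothly with the predictions, unlike the thresholded F1 metric.
    """
    tp = sum(p * y for p, y in zip(probs, labels))        # soft true positives
    fp = sum(p * (1 - y) for p, y in zip(probs, labels))  # soft false positives
    fn = sum((1 - p) * y for p, y in zip(probs, labels))  # soft false negatives
    soft_f1 = 2 * tp / (2 * tp + fp + fn + eps)
    return 1.0 - soft_f1


def majority_vote(predictions):
    """Combine label sequences from several models by per-position vote.

    predictions: list of label sequences, one per model, all equal length.
    Ties go to the label that appears first among the models' outputs
    (an arbitrary choice made for this sketch).
    """
    voted = []
    for labels_at_position in zip(*predictions):
        winner, _count = Counter(labels_at_position).most_common(1)[0]
        voted.append(winner)
    return voted


# Hypothetical usage: three models tagging a three-token sentence.
ensemble_output = majority_vote([
    ["B", "I", "O"],
    ["B", "O", "O"],
    ["O", "I", "O"],
])
perfect_loss = soft_f1_loss([1.0, 0.0, 1.0], [1, 0, 1])
worst_loss = soft_f1_loss([0.0, 1.0], [1, 0])
```

In a real training setup the same soft counts would be computed on framework tensors so that gradients can flow through the loss; the pure-Python version above only shows the arithmetic.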
