CL Jun 10, 2021

Neural Text Classification and Stacked Heterogeneous Embeddings for Named Entity Recognition in SMM4H 2021

arXiv:2106.05823v231.7726 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses named entity recognition and text classification in medical social media text. Its contribution is incremental, because it applies existing methods to new datasets.

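The paper tackled Named Entity Recognition (NER) and Text Classification tasks in the SMM4H 2021 Shared Task. For NER, it achieved F1-scores of 0.50 on ADE Span Detection (Task 1b) and 0.82 on Profession Span Detection (Task 7b). For text classification, it achieved F1-scores of 0.46 on ADE Classification (Task 1a) and 0.90 on Profession Classification (Task 7a).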

This paper presents our findings from participating in the SMM4H Shared Task 2021. We addressed Named Entity Recognition (NER) and Text Classification. For NER, we explored a BiLSTM-CRF model with Stacked Heterogeneous Embeddings and linguistic features. For text classification, we investigated several machine learning algorithms: logistic regression, Support Vector Machines (SVM), and neural networks. Our proposed approaches can be generalized to different languages, and we have shown their effectiveness for English and Spanish. Our text classification submissions (team: MIC-NLP) achieved competitive performance, with F1-scores of $0.46$ and $0.90$ on ADE Classification (Task 1a) and Profession Classification (Task 7a) respectively. For NER, our submissions achieved F1-scores of $0.50$ and $0.82$ on ADE Span Detection (Task 1b) and Profession Span Detection (Task 7b) respectively.
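
To make the idea of stacked heterogeneous embeddings concrete, here is a minimal Python sketch. It assumes that "stacking" means concatenating, per token, the vectors produced by different embedding sources, which is the common meaning of the term; the abstract does not spell out the exact sources or dimensions. The lookup table `WORD_VECS`, the helper names, and the hand-crafted surface features are all hypothetical stand-ins, not the authors' implementation.

```python
from typing import Dict, List

# Hypothetical toy lookup table standing in for a pretrained word embedding.
WORD_VECS: Dict[str, List[float]] = {
    "headache": [0.2, 0.7],
    "nurse": [0.9, 0.1],
}
WORD_DIM = 2


def word_embedding(token: str) -> List[float]:
    """Look up a word vector; unknown tokens map to a zero vector."""
    return WORD_VECS.get(token.lower(), [0.0] * WORD_DIM)


def surface_features(token: str) -> List[float]:
    """Hand-crafted features standing in for a second embedding source."""
    return [
        float(token[:1].isupper()),              # starts with a capital letter
        float(any(c.isdigit() for c in token)),  # contains a digit
        min(len(token), 20) / 20.0,              # length, capped and scaled
    ]


def stack_embeddings(token: str) -> List[float]:
    """Concatenate the vectors from every source into one token vector."""
    return word_embedding(token) + surface_features(token)


def embed_sentence(tokens: List[str]) -> List[List[float]]:
    """Produce one stacked vector per token, ready for a sequence tagger."""
    return [stack_embeddings(t) for t in tokens]
```

In a setup like the one the abstract describes, the per-token vectors returned by `embed_sentence` would be the input to a sequence tagger such as a BiLSTM-CRF; the tagger itself is outside the scope of this sketch.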

