CL · Nov 30, 2021

Automatic Extraction of Medication Names in Tweets as Named Entity Recognition

arXiv:2111.15641v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of mining health-related information from social media for public health and medical research. Its contribution is incremental: it applies existing methods to a specific dataset.

The researchers tackled the problem of automatically extracting medication names from tweets as a named entity recognition task, achieving a strict F1 score of 0.764 on unseen test data using an ensemble of fine-tuned BERT-style models.

Social media posts contain potentially valuable information about medical conditions and health-related behavior. BioCreative VII Task 3 focuses on mining this information by recognizing mentions of medications and dietary supplements in tweets. We approach this task by fine-tuning multiple BERT-style language models to perform token-level classification and combining them into ensembles to generate final predictions. Our best system consists of five Megatron-BERT-345M models and achieves a strict F1 score of 0.764 on unseen test data.
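The abstract does not say how the individual models' outputs are combined or how the strict score is computed, so the following is only a minimal sketch under stated assumptions: each model emits one BIO tag per token, the ensemble takes a per-token majority vote, and "strict" means a predicted span counts only if both of its boundaries match a gold span exactly. The function names (`majority_vote`, `extract_spans`, `strict_f1`) and the example tweet are hypothetical and not taken from the paper.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-token BIO tags from several models by majority vote.

    predictions: list of tag sequences, one per model, all of equal length.
    Ties go to the tag that appears first in model order (an assumption).
    """
    combined = []
    for token_tags in zip(*predictions):
        counts = Counter(token_tags)
        best = max(counts.values())
        combined.append(next(t for t in token_tags if counts[t] == best))
    return combined


def extract_spans(tags):
    """Return the set of (start, end) token spans in a BIO sequence.

    end is exclusive. An "I" tag with no preceding "B" is ignored.
    """
    spans, start = set(), None
    for i, tag in enumerate(list(tags) + ["O"]):  # sentinel closes a trailing span
        if tag != "I" and start is not None:
            spans.add((start, i))
            start = None
        if tag == "B":
            start = i
    return spans


def strict_f1(gold_spans, pred_spans):
    """F1 where a prediction is correct only on an exact boundary match."""
    tp = len(gold_spans & pred_spans)
    precision = tp / len(pred_spans) if pred_spans else 0.0
    recall = tp / len(gold_spans) if gold_spans else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical tweet tokens: ["took", "two", "advil", "pm", "today"]
gold = ["O", "O", "B", "I", "O"]
model_outputs = [
    ["O", "O", "B", "I", "O"],  # model 1
    ["O", "O", "B", "O", "O"],  # model 2 truncates the mention
    ["O", "O", "B", "I", "O"],  # model 3
]

ensemble_tags = majority_vote(model_outputs)
ensemble_score = strict_f1(extract_spans(gold), extract_spans(ensemble_tags))
single_score = strict_f1(extract_spans(gold), extract_spans(model_outputs[1]))
print(ensemble_tags, ensemble_score, single_score)
```

In this toy case the vote recovers the full two-token mention, while the truncated prediction from the second model alone scores zero under strict matching because its right boundary is off by one token. Averaging the models' per-token probabilities before taking the argmax is a common alternative to voting; the abstract does not indicate which, if either, the authors used.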

