Fine-Tuning Transformers for Identifying Self-Reporting Potential Cases and Symptoms of COVID-19 in Tweets
This work addresses the need for automated detection of COVID-19-related information in social media for public health monitoring. Its contribution is incremental, since it applies an existing method to two specific tasks.
The paper classifies tweets by whether they contain self-reported COVID-19 symptoms and by reporting type, fine-tuning DistilBERT for each task and reporting accurate results as part of the SMM4H shared tasks.
We describe our straightforward approach to Tasks 5 and 6 of the 2021 Social Media Mining for Health Applications (SMM4H) shared tasks. Our system is based on fine-tuning DistilBERT on each task, both directly and after first fine-tuning the model on the other task. We explore how much fine-tuning is necessary to accurately classify tweets as containing self-reported COVID-19 symptoms (Task 5) and to determine whether a tweet related to COVID-19 is self-reporting, non-personal reporting, or a literature/news mention of the virus (Task 6).
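As a rough illustration of the two training regimes described above (fine-tuning on the target task only, or fine-tuning on the other task first), the following is a minimal sketch and not the authors' implementation. The label names, the `fine_tuning_stages` and `build_classifier` helpers, and the `distilbert-base-uncased` checkpoint are assumptions made for this example; the abstract does not state which DistilBERT variant or which label strings were used.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Task:
    name: str
    labels: Tuple[str, ...]  # hypothetical label names, for illustration only


# Task 5 is binary and Task 6 has three classes, per the task descriptions.
TASK5 = Task("task5", ("no_self_reported_symptoms", "self_reported_symptoms"))
TASK6 = Task("task6", ("self_report", "non_personal_report", "literature_news_mention"))


def fine_tuning_stages(target: Task, intermediate: Optional[Task] = None) -> List[Task]:
    """Return the tasks to fine-tune on, in order, ending with the target task."""
    if intermediate is not None and intermediate.name == target.name:
        raise ValueError("the intermediate task must differ from the target task")
    return [task for task in (intermediate, target) if task is not None]


def build_classifier(task: Task, checkpoint: str = "distilbert-base-uncased"):
    """Load a sequence classifier sized for `task` from `checkpoint`.

    `checkpoint` may be the pretrained model (assumed name) or a directory
    saved after an earlier stage. The import is deferred so the planning
    code above runs without the `transformers` package installed.
    """
    from transformers import AutoModelForSequenceClassification

    return AutoModelForSequenceClassification.from_pretrained(
        checkpoint,
        num_labels=len(task.labels),
        # Lets a checkpoint with a 2-way head be reloaded with a fresh
        # 3-way head (or the reverse) when moving between tasks.
        ignore_mismatched_sizes=True,
    )


# Direct regime: Task 6 only. Sequential regime: Task 5 first, then Task 6.
direct = fine_tuning_stages(TASK6)
sequential = fine_tuning_stages(TASK6, intermediate=TASK5)
```

In the sequential regime, each stage would call `build_classifier` with the checkpoint saved by the previous stage, so the encoder weights carry over while the classification head is re-initialized for the new label set. The training loop itself is omitted here.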