Text Augmentations with R-drop for Classification of Tweets Self Reporting Covid-19
This work addresses a domain-specific health monitoring problem on social media, with incremental improvements over baseline methods.
The paper classifies tweets that self-report a COVID-19 diagnosis using a model with textual augmentations and R-drop to reduce overfitting, achieving an F1 score of 0.877 on the test set.
This paper presents models created for the Social Media Mining for Health 2023 shared task. Our team addressed the first task, classifying tweets that self-report a Covid-19 diagnosis. Our approach is a classification model that combines diverse textual augmentations with R-drop regularization to mitigate overfitting and improve performance. Our best model, trained with R-drop and augmentations such as synonym substitution, reserved words, and back translation, outperforms the task mean and median scores. Our system achieves an F1 score of 0.877 on the test set.
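To make the role of R-drop concrete, the following is a minimal sketch of the R-drop training objective for a single example, written in plain Python. It assumes the standard formulation: the same input is passed through the model twice with dropout active, giving two sets of logits, and the loss is the sum of the two cross-entropy terms plus a weighted symmetric KL divergence between the two predicted distributions. The function names, the weight `alpha`, and the example logits are illustrative assumptions; the abstract does not state the hyperparameters or the implementation used by the system.

```python
import math

def softmax(logits):
    """Convert raw logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-likelihood of the gold label."""
    return -math.log(probs[label])

def kl_divergence(p, q):
    """KL(p || q) for two discrete distributions over the same classes."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def r_drop_loss(logits_a, logits_b, label, alpha=1.0):
    """R-drop objective for one example (illustrative sketch).

    logits_a and logits_b stand for the outputs of two forward passes of the
    same model on the same tweet, each with a different dropout mask.
    alpha is a hypothetical weight on the consistency term.
    """
    p = softmax(logits_a)
    q = softmax(logits_b)
    # Supervised term: both passes should predict the gold label.
    ce = cross_entropy(p, label) + cross_entropy(q, label)
    # Consistency term: the two passes should agree with each other.
    sym_kl = 0.5 * (kl_divergence(p, q) + kl_divergence(q, p))
    return ce + alpha * sym_kl

# Example: binary task (0 = not a self-report, 1 = self-report).
loss = r_drop_loss([0.2, 1.5], [0.6, 1.1], label=1, alpha=1.0)
```

When the two passes produce identical logits, the KL term vanishes and the loss reduces to twice the ordinary cross-entropy; the more the two dropout passes disagree, the larger the penalty, which is how R-drop discourages overfitting.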