The COVMis-Stance dataset: Stance Detection on Twitter for COVID-19 Misinformation
This work addresses stance detection for COVID-19 misinformation on social media, which is an incremental contribution to domain-specific NLP tasks.
The authors address the problem of detecting Twitter users' stances towards COVID-19 misinformation by constructing a new dataset of 2631 annotated tweets and fine-tuning models on existing datasets. The best performance comes from sequential fine-tuning, first on MNLI and then on the combination of the undersampled RumourEval and COVIDLies datasets.
During the COVID-19 pandemic, large amounts of COVID-19 misinformation are spreading on social media. We are interested in the stance of Twitter users towards COVID-19 misinformation. However, because the pandemic is relatively recent, only a few stance detection datasets fit our task. We have constructed a new stance dataset consisting of 2631 tweets annotated with the stance towards COVID-19 misinformation. Because labeled data are limited, we fine-tune our models by leveraging the MNLI dataset and two existing stance detection datasets (RumourEval and COVIDLies), and evaluate model performance on our dataset. Our experimental results show that the model performs best when fine-tuned sequentially, first on the MNLI dataset and then on the combination of the undersampled RumourEval and COVIDLies datasets. Our code and dataset are publicly available at https://github.com/yanfangh/covid-rumor-stance.
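The data preparation for the second fine-tuning stage can be sketched as follows. This is a minimal illustration, not the released code: it assumes that each dataset is undersampled separately to the size of its rarest class and that the two balanced sets are then concatenated. The abstract does not specify the undersampling scheme, so that choice is an assumption. The function names (`undersample`, `combine_undersampled`), the toy examples, and the shared three-way label set are all hypothetical.

```python
import random
from collections import Counter, defaultdict


def undersample(examples, seed=0):
    """Randomly keep, for every label, as many examples as the rarest label has.

    `examples` is a list of (text, label) pairs. This balancing rule is an
    assumption; the source only says the datasets are "undersampled".
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in examples:
        by_label[label].append((text, label))
    n_min = min(len(items) for items in by_label.values())
    balanced = []
    for label in sorted(by_label):
        balanced.extend(rng.sample(by_label[label], n_min))
    rng.shuffle(balanced)
    return balanced


def combine_undersampled(*datasets, seed=0):
    """Undersample each dataset on its own, then concatenate the results."""
    combined = []
    for offset, dataset in enumerate(datasets):
        combined.extend(undersample(dataset, seed=seed + offset))
    return combined


# Toy stand-ins for RumourEval and COVIDLies (hypothetical texts and labels).
rumoureval_toy = (
    [(f"r-comment-{i}", "comment") for i in range(5)]
    + [(f"r-support-{i}", "support") for i in range(2)]
    + [("r-deny-0", "deny")]
)
covidlies_toy = (
    [(f"c-comment-{i}", "comment") for i in range(4)]
    + [(f"c-support-{i}", "support") for i in range(2)]
    + [(f"c-deny-{i}", "deny") for i in range(2)]
)

# Training set for the second stage; the first stage (MNLI) is not shown here.
stage_two_train = combine_undersampled(rumoureval_toy, covidlies_toy)
label_counts = Counter(label for _, label in stage_two_train)
print(len(stage_two_train), dict(label_counts))
```

In a sequential setup, a model already fine-tuned on MNLI would then be fine-tuned further on `stage_two_train` and evaluated on the new 2631-tweet dataset. Balancing each source separately keeps a large majority class in one dataset from dominating the combined set.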