Fighting an Infodemic: COVID-19 Fake News Dataset
This work addresses the spread of misinformation during the COVID-19 pandemic, but its contribution is incremental: it applies existing methods to a new dataset.
The authors tackled the problem of COVID-19 fake news by curating a manually annotated dataset of 10,700 social media posts and articles, achieving a best performance of 93.46% F1-score with SVM.
Along with the COVID-19 pandemic, we are also fighting an 'infodemic'. Fake news and rumors are rampant on social media. Believing in rumors can cause significant harm. This is further exacerbated at the time of a pandemic. To tackle this, we curate and release a manually annotated dataset of 10,700 social media posts and articles of real and fake news on COVID-19. We benchmark the annotated dataset with four machine learning baselines: Decision Tree, Logistic Regression, Gradient Boost, and Support Vector Machine (SVM). We obtain the best performance of 93.46% F1-score with SVM. The data and code are available at: https://github.com/parthpatwa/covid19-fake-news-dectection
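
To make the reported metric concrete, the following is a minimal sketch of how an F1-score can be computed for a real/fake classifier. It is not the authors' evaluation code: the function names `f1_for_class` and `macro_f1`, the toy labels, and the choice of macro averaging are all assumptions made for illustration, and the abstract does not state which averaging scheme underlies the 93.46% figure.

```python
from typing import Sequence


def f1_for_class(y_true: Sequence[str], y_pred: Sequence[str], positive: str) -> float:
    """Return the F1-score for one class, treating `positive` as the target label."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        # No true positives: precision or recall is zero (or undefined), so F1 is 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


def macro_f1(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Unweighted mean of per-class F1 over the labels present in y_true."""
    labels = sorted(set(y_true))
    return sum(f1_for_class(y_true, y_pred, label) for label in labels) / len(labels)


# Hypothetical toy labels, not taken from the dataset.
y_true = ["real", "real", "fake", "fake", "real", "fake"]
y_pred = ["real", "fake", "fake", "fake", "real", "real"]

fake_f1 = f1_for_class(y_true, y_pred, "fake")
overall_f1 = macro_f1(y_true, y_pred)
print(f"F1 (fake): {fake_f1:.4f}, macro F1: {overall_f1:.4f}")
```

In this toy case each class has two true positives, one false positive, and one false negative, so precision and recall are both 2/3 for either class. A weighted average, which scales each class's F1 by its support, would be a drop-in alternative when the classes are imbalanced.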