IR CY LG SI
Jan 9, 2021

Eating Garlic Prevents COVID-19 Infection: Detecting Misinformation on the Arabic Content of Twitter

Sarah Alqurashi, Btool Hamoui, Abdulaziz Alashaikh, Ahmad Alhindi, Eisa Alanazi

arXiv:2101.05626v15.143 citations Has Code

Originality: Incremental advance

AI Analysis

This work is significant for social media users and platforms, as it provides a method for identifying misinformation related to COVID-19 in Arabic content, which is an under-researched area.

This paper addresses the problem of detecting COVID-19 misinformation in Arabic tweets. The authors built a large dataset of Arabic COVID-19 tweets, gold-annotated for misinformation, and applied various machine learning models. XGBoost achieved the highest accuracy in detecting misinformation.

The rapid growth of social media content during the current pandemic provides useful tools for disseminating information, but it has also become a source of misinformation. Therefore, there is an urgent need for fact-checking and for effective techniques to detect misinformation on social media. In this work, we study misinformation in the Arabic content of Twitter. We construct a large Arabic dataset related to COVID-19 misinformation and gold-annotate the tweets into two categories: misinformation or not. Then, we apply eight different traditional and deep machine learning models with different features, including word embeddings and word frequency. The word embedding models (FastText and word2vec) draw on more than two million Arabic tweets related to COVID-19. Experiments show that optimizing the area under the curve (AUC) improves the models' performance and that Extreme Gradient Boosting (XGBoost) achieves the highest accuracy in detecting COVID-19 misinformation online.
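The abstract names the area under the ROC curve (AUC) as the quantity being optimized. As a rough illustration of what that metric measures for a two-class task (misinformation or not), the sketch below computes AUC with the pairwise rank formulation using only the Python standard library. It is not the authors' code: the function name `roc_auc` and the sample labels and scores are assumptions made up for this example.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: sequence of 0/1 values (1 = misinformation in this illustration).
    scores: classifier confidence that each item belongs to class 1.
    AUC equals the fraction of (positive, negative) pairs in which the
    positive item receives the higher score; ties count as half a win.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical labels and scores, invented for illustration only.
y_true = [1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.1, 0.8]
auc = roc_auc(y_true, y_score)
print(f"AUC = {auc:.3f}")
```

Because AUC depends only on how the scores rank positives against negatives, it does not change with the decision threshold, which is one common reason to tune a classifier against it when the classes are imbalanced. The quadratic pairwise loop is fine for a small example; a sort-based version would be preferable for a corpus of many tweets.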

