IR, CL | Jul 20, 2020

Characterizing drug mentions in COVID-19 Twitter Chatter

arXiv:2007.10276v256.2994 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of analyzing social media data for public health insights. Its contribution is incremental: it improves data preprocessing for drug mention detection.

The researchers tackled the problem of identifying drug mentions in COVID-19 Twitter data, recovering nearly 15% additional data by using machine learning alongside traditional methods to handle informal language and misspellings.

Since the classification of COVID-19 as a global pandemic, there have been many attempts to treat and contain the virus. Although there is no specific antiviral treatment recommended for COVID-19, there are several drugs that can potentially help with symptoms. In this work, we mined a large Twitter dataset of 424 million tweets of COVID-19 chatter to identify discourse around drug mentions. While this is seemingly a straightforward task, the informal nature of language use on Twitter leads us to demonstrate the need for machine learning alongside traditional automated methods. By applying these complementary methods, we are able to recover almost 15% additional data, which makes misspelling handling a necessary pre-processing step when dealing with social media data.
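The abstract does not spell out how misspelled drug names are recovered, so the following is only a minimal sketch of the general idea: an exact dictionary lookup complemented by a tolerant matching step. It uses plain string similarity from Python's standard `difflib` module as a stand-in and is not the authors' pipeline; the machine learning component they mention is not reproduced here. The lexicon `DRUG_LEXICON`, the function `find_drug_mentions`, and the `cutoff` value are all hypothetical choices made for illustration.

```python
import difflib
import re

# Hypothetical mini-lexicon; a real study would use a full drug dictionary.
DRUG_LEXICON = ["hydroxychloroquine", "remdesivir", "azithromycin", "ibuprofen"]


def find_drug_mentions(tweet, lexicon=DRUG_LEXICON, cutoff=0.85):
    """Return (token, drug, kind) tuples for exact and near-miss matches.

    `cutoff` is an assumed similarity threshold, not a value from the paper.
    """
    mentions = []
    # Lowercase the tweet and keep alphabetic tokens only.
    for token in re.findall(r"[a-z]+", tweet.lower()):
        if token in lexicon:
            # Traditional step: exact dictionary match.
            mentions.append((token, token, "exact"))
            continue
        # Tolerant step: closest lexicon entry above the similarity cutoff.
        close = difflib.get_close_matches(token, lexicon, n=1, cutoff=cutoff)
        if close:
            mentions.append((token, close[0], "fuzzy"))
    return mentions


if __name__ == "__main__":
    for mention in find_drug_mentions("Is hydroxycloroquine safe? I took ibuprofen"):
        print(mention)
```

The "fuzzy" entries are the mentions an exact lookup alone would miss, which is the kind of data the abstract reports recovering. A lower `cutoff` would catch more misspellings at the cost of more false matches.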
