CL AI LG · Sep 8, 2022

5q032e@SMM4H'22: Transformer-based classification of premise in tweets related to COVID-19

arXiv:2209.03851v20.32 citations · h-index: 3

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of automating stance mining from social media data during the COVID-19 pandemic, but it is incremental as it applies existing transformer methods to a specific dataset.

The authors tackled the problem of classifying the presence of premise in tweets related to COVID-19 using a transformer-based model, achieving a ROC AUC of 0.807 and an F1 score of 0.7648 with RoBERTa.

Automation of social network data assessment is one of the classic challenges of natural language processing. During the COVID-19 pandemic, mining people's stances from public messages has become crucial for understanding attitudes towards health orders. In this paper, the authors propose a predictive model based on the transformer architecture to classify the presence of a premise in Twitter texts. This work was completed as part of the Social Media Mining for Health (SMM4H) Workshop 2022. We explored modern transformer-based classifiers in order to construct a pipeline that efficiently captures tweet semantics. Our experiments on a Twitter dataset showed that RoBERTa is superior to the other transformer models on the premise prediction task. The model achieved competitive performance, with a ROC AUC of 0.807 and an F1 score of 0.7648.
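The two reported metrics measure different things: F1 is computed from hard 0/1 predictions, while ROC AUC is computed from the ranking of the classifier's scores. The sketch below illustrates both in plain Python. It is not the authors' evaluation code. The function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, and the abstract does not state how the scores were thresholded.

```python
def f1_score(y_true, y_pred):
    """F1 for the positive class (label 1), from hard 0/1 predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def roc_auc(y_true, scores):
    """ROC AUC as the fraction of (positive, negative) pairs in which
    the positive example gets the higher score; ties count as half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = tweet contains a premise, 0 = it does not.
labels = [1, 0, 1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.2, 0.7]  # made-up classifier probabilities
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed 0.5 threshold

print(f"F1 = {f1_score(labels, preds):.4f}")
print(f"ROC AUC = {roc_auc(labels, scores):.4f}")
```

The pairwise loop is quadratic in the number of examples, which is fine for a small illustration. For a real evaluation set, a library routine such as scikit-learn's `roc_auc_score` and `f1_score` would be the usual choice.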
