CL · IR · LG · Dec 14, 2019

SemEval-2013 Task 2: Sentiment Analysis in Twitter

Preslav Nakov, Zornitsa Kozareva, Alan Ritter, Sara Rosenthal, Veselin Stoyanov, Theresa Wilson

arXiv:1912.06806v131.61134 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses a dataset bottleneck for researchers in sentiment analysis, though it is incremental as it builds on existing evaluation frameworks.

The authors address the lack of suitable datasets for sentiment analysis in social media by proposing SemEval-2013 Task 2, which comprised two subtasks: an expression-level subtask (A) and a message-level subtask (B). Twitter and SMS data were labeled through crowdsourcing on Amazon Mechanical Turk, and the best-performing team achieved an F1 of 88.9% on subtask A and 69% on subtask B.

In recent years, sentiment analysis in social media has attracted a lot of research interest and has been used for a number of applications. Unfortunately, research has been hindered by the lack of suitable datasets, complicating the comparison between approaches. To address this issue, we have proposed SemEval-2013 Task 2: Sentiment Analysis in Twitter, which included two subtasks: A, an expression-level subtask, and B, a message-level subtask. We used crowdsourcing on Amazon Mechanical Turk to label a large Twitter training dataset along with additional test sets of Twitter and SMS messages for both subtasks. All datasets used in the evaluation are released to the research community. The task attracted significant interest and a total of 149 submissions from 44 teams. The best-performing team achieved an F1 of 88.9% and 69% for subtasks A and B, respectively.
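For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1 score could be computed for a polarity classification task of this kind. The abstract does not spell out the scoring formula, so averaging the per-class F1 over only the positive and negative classes is an assumption made here. The function names and the toy labels are hypothetical and are not taken from the task's official scorer.

```python
from typing import Sequence


def f1_for_class(gold: Sequence[str], pred: Sequence[str], label: str) -> float:
    """F1 for a single class, treating `label` as the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
    fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
    fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
    if tp == 0:
        # No true positives: precision and recall are zero or undefined.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def avg_f1_pos_neg(gold: Sequence[str], pred: Sequence[str]) -> float:
    """Average of the F1 scores of the positive and negative classes.

    Assumption: neutral messages are not scored as a class of their own,
    but neutral errors still lower precision/recall of the other classes.
    """
    return (f1_for_class(gold, pred, "positive")
            + f1_for_class(gold, pred, "negative")) / 2


# Toy, made-up labels for illustration only (not from the released data).
gold = ["positive", "negative", "neutral", "positive"]
pred = ["positive", "neutral", "neutral", "negative"]
score = avg_f1_pos_neg(gold, pred)
```

Under this assumed scheme, a system that only ever predicts the neutral class would score zero, since it never produces a true positive for either scored class.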
