CLMay 1, 2020

Will-They-Won't-They: A Very Large Dataset for Stance Detection on Twitter

Costanza Conforti, Jakob Berndt, Mohammad Taher Pilehvar, Chryssi Giannitsarou, Flavio Toxvaerd, Nigel Collier

arXiv:2005.00388v131.41004 citationsHas Code

Originality Synthesis-oriented

AI Analysis

This provides a new benchmark for researchers in stance detection, though it is incremental as it focuses on dataset creation rather than method innovation.

The authors tackled the problem of stance detection on Twitter by creating a large, high-quality dataset called Will-They-Won't-They (WT-WT) with 51,284 expert-annotated tweets, and found that it poses a strong challenge to existing state-of-the-art models.

We present a new challenging stance detection dataset, called Will-They-Won't-They (WT-WT), which contains 51,284 tweets in English, making it by far the largest available dataset of the type. All the annotations are carried out by experts; therefore, the dataset constitutes a high-quality and reliable benchmark for future research in stance detection. Our experiments with a wide range of recent state-of-the-art stance detection systems show that the dataset poses a strong challenge to existing models in this domain.

View on arXiv PDF Code

Similar