CL, SI · Dec 14, 2019

#MeTooMA: Multi-Aspect Annotations of Tweets Related to the MeToo Movement

arXiv:1912.06927v2 · 52 citations
Originality: Synthesis-oriented
AI Analysis

This dataset gives researchers in psycholinguistics, sociolinguistics, and computational linguistics a resource for studying digital social movements around sensitive issues such as sexual harassment. Its contribution is incremental, however, because it applies existing annotation methods to new data.

The paper addresses the need for annotated data on social media discourse by presenting a dataset of 9,973 tweets related to the MeToo movement, manually annotated for five linguistic aspects (relevance, stance, hate speech, sarcasm, and dialogue acts) with high inter-annotator agreement (Krippendorff's alpha of 0.79 to 0.93).

In this paper, we present a dataset containing 9,973 tweets related to the MeToo movement that were manually annotated for five different linguistic aspects: relevance, stance, hate speech, sarcasm, and dialogue acts. We present a detailed account of the data collection and annotation processes. The annotations have a very high inter-annotator agreement (0.79 to 0.93 k-alpha) due to the domain expertise of the annotators and clear annotation instructions. We analyze the data in terms of geographical distribution, label correlations, and keywords. Lastly, we present some potential use cases of this dataset. We expect this dataset would be of great interest to psycholinguists, socio-linguists, and computational linguists to study the discursive space of digitally mobilized social movements on sensitive issues like sexual harassment.
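The agreement figures quoted above are Krippendorff's alpha values. As a rough illustration of what that statistic measures, the sketch below computes alpha for nominal labels from a coincidence matrix. It is not the authors' code: the function name `krippendorff_alpha_nominal` and the toy labels are assumptions made for this example, and the output has no relation to the paper's reported 0.79 to 0.93.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels.

    units: list of lists; each inner list holds the labels that the
    annotators gave to one item. Use None for a missing annotation.
    """
    # Coincidence matrix: every ordered pair of labels within an item
    # counts 1 / (m - 1), where m is the number of labels on that item.
    coincidences = Counter()
    for labels in units:
        present = [label for label in labels if label is not None]
        m = len(present)
        if m < 2:
            continue  # an item with fewer than two labels cannot be paired
        for a, b in permutations(present, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label and the overall number of pairable values.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())
    if n <= 1:
        raise ValueError("need at least one item with two or more labels")

    # Observed and expected disagreement (both scaled by n, which cancels).
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1)
    if expected == 0:
        raise ValueError("alpha is undefined when only one label occurs")
    return 1.0 - observed / expected


# Hypothetical stance labels from two annotators on ten tweets.
toy_units = [
    ("oppose", "support"), ("support", "support"), ("oppose", "support"),
    ("oppose", "oppose"), ("oppose", "oppose"), ("oppose", "support"),
    ("oppose", "oppose"), ("oppose", "oppose"), ("support", "oppose"),
    ("oppose", "oppose"),
]
toy_alpha = krippendorff_alpha_nominal(toy_units)
print(f"alpha = {toy_alpha:.3f}")
```

An alpha of 1 means perfect agreement and a value near 0 means agreement no better than chance, so values of 0.79 and above indicate that annotators disagreed far less often than chance labelling would predict.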

