CL · Jan 27, 2017

Measuring the Reliability of Hate Speech Annotations: The Case of the European Refugee Crisis

arXiv:1701.08118v1 · 432 citations
Originality: Synthesis-oriented
AI Analysis

This paper addresses the challenge of inconsistent annotations in the training data for hate speech detection systems, a problem central to social media moderation. The contribution is incremental, however: it documents existing reliability issues without proposing a new solution.

The study tested annotation reliability by having two groups of internet users rate potentially hateful messages; one group was shown a definition of hate speech beforehand. Showing the definition partially aligned raters' opinions with it but did not improve reliability, which remained very low overall.

Some users of social media are spreading racist, sexist, and otherwise hateful content. For the purpose of training a hate speech detection system, the reliability of the annotations is crucial, but there is no universally agreed-upon definition. We collected potentially hateful messages and asked two groups of internet users to determine whether they were hate speech or not, whether they should be banned or not and to rate their degree of offensiveness. One of the groups was shown a definition prior to completing the survey. We aimed to assess whether hate speech can be annotated reliably, and the extent to which existing definitions are in accordance with subjective ratings. Our results indicate that showing users a definition caused them to partially align their own opinion with the definition but did not improve reliability, which was very low overall. We conclude that the presence of hate speech should perhaps not be considered a binary yes-or-no decision, and raters need more detailed instructions for the annotation.
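To make "reliability of the annotations" concrete, the sketch below computes one common chance-corrected agreement coefficient, Krippendorff's alpha for nominal labels. The text above does not say which coefficient the study used, so this choice is an assumption, as are the function name `krippendorff_alpha_nominal` and the toy labels, which are illustrative and not the study's data.

```python
from collections import Counter
from itertools import permutations


def krippendorff_alpha_nominal(units):
    """Krippendorff's alpha for nominal labels (hypothetical helper).

    `units` is a list of messages; each message is the list of labels
    it received from different raters, e.g. ["hate", "not_hate"].
    Messages with fewer than two labels carry no agreement information
    and are skipped.
    """
    # Coincidence matrix: every ordered pair of labels given to the same
    # message by two different raters, weighted by 1 / (m - 1).
    coincidences = Counter()
    for labels in units:
        m = len(labels)
        if m < 2:
            continue
        for a, b in permutations(labels, 2):
            coincidences[(a, b)] += 1.0 / (m - 1)

    # Marginal totals per label value and overall number of pairable labels.
    totals = Counter()
    for (a, _b), weight in coincidences.items():
        totals[a] += weight
    n = sum(totals.values())

    # Observed disagreement: weight on mismatching label pairs.
    observed = sum(w for (a, b), w in coincidences.items() if a != b)
    # Disagreement expected by chance, from the marginal totals.
    expected = sum(
        totals[a] * totals[b] for a in totals for b in totals if a != b
    ) / (n - 1) if n > 1 else 0.0

    if expected == 0:
        # Only one label value was ever used: alpha is undefined.
        return float("nan")
    return 1.0 - observed / expected


# Toy example (made-up labels): two raters, four messages.
toy_units = [
    ["hate", "hate"],
    ["not_hate", "not_hate"],
    ["hate", "not_hate"],
    ["not_hate", "not_hate"],
]
alpha = krippendorff_alpha_nominal(toy_units)
print(f"alpha = {alpha:.3f}")
```

A value of 1 means perfect agreement, values near 0 mean agreement no better than chance, and negative values mean systematic disagreement. Comparing the coefficient between a group shown a definition and a group not shown one would mirror the two-group design described above, under the same assumptions.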

Code Implementations: 1 repo