CL, AI, LG | Feb 12, 2025

Dealing with Annotator Disagreement in Hate Speech Classification

arXiv:2502.08266v2 | 6 citations | h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of inconsistent labeling in hate speech detection on social media, which is crucial for mitigating harmful content. The contribution is incremental because it focuses on a single language and dataset.

The paper tackles the problem of annotator disagreement in hate speech classification by evaluating automatic aggregation methods for Turkish tweets, providing state-of-the-art benchmark results.

Hate speech detection is a crucial task, especially on social media, where harmful content can spread quickly. Implementing machine learning models to automatically identify and address hate speech is essential for mitigating its impact and preventing its proliferation. The first step in developing an effective hate speech detection model is to acquire a high-quality dataset for training. Labeled data is essential for most natural language processing tasks, but categorizing hate speech is difficult due to the diverse and often subjective nature of hate speech, which can lead to varying interpretations and disagreements among annotators. This paper examines strategies for addressing annotator disagreement, an issue that has been largely overlooked. In particular, we evaluate various automatic approaches for aggregating multiple annotations, in the context of hate speech classification in Turkish tweets. Our work highlights the importance of the problem and provides state-of-the-art benchmark results for the detection and understanding of hate speech in online discourse.
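To make the idea of aggregating multiple annotations concrete, here is a minimal sketch of majority voting, a common baseline for combining labels from several annotators. The abstract does not say which aggregation methods the paper evaluates, so this is an illustration and not the authors' method. The label names, tweet identifiers, and the `majority_vote` and `agreement_rate` helpers are all hypothetical.

```python
from collections import Counter


def majority_vote(labels, tie_label=None):
    """Return the most frequent label, or tie_label when the top two counts are equal."""
    if not labels:
        raise ValueError("labels must contain at least one annotation")
    ranked = Counter(labels).most_common()
    # A tie between the two most frequent labels signals unresolved disagreement.
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return tie_label
    return ranked[0][0]


def agreement_rate(labels):
    """Fraction of annotators who chose the most frequent label."""
    if not labels:
        raise ValueError("labels must contain at least one annotation")
    top_count = Counter(labels).most_common(1)[0][1]
    return top_count / len(labels)


# Hypothetical annotations: each tweet is labeled by two or three annotators.
annotations = {
    "tweet_1": ["hate", "hate", "not_hate"],
    "tweet_2": ["hate", "not_hate"],
    "tweet_3": ["not_hate", "not_hate", "not_hate"],
}

# One aggregated label per tweet; ties stay None so they can be reviewed separately.
aggregated = {tweet: majority_vote(labels) for tweet, labels in annotations.items()}
```

Tied items are returned as `None` here instead of being forced into a class, so the disagreement stays visible for manual review or a different aggregation rule.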

