CL · AI · SI · Nov 1, 2022

Why Is It Hate Speech? Masked Rationale Prediction for Explainable Hate Speech Detection

arXiv:2211.00243v1582 citations · h-index: 26
Originality: Incremental advance
AI Analysis

This paper addresses the need for more interpretable and less biased hate speech detection models. The advance is incremental because it builds on existing detection methods.

The paper aims to reduce bias and improve explainability in hate speech detection by proposing Masked Rationale Prediction (MRP) as an intermediate task. The method generally achieves state-of-the-art performance across various metrics.

In a hate speech detection model, we should consider two critical aspects in addition to detection performance: bias and explainability. Hate speech cannot be identified based solely on the presence of specific words; the model should be able to reason like humans and be explainable. To improve performance on these two aspects, we propose Masked Rationale Prediction (MRP) as an intermediate task. MRP is the task of predicting masked human rationales (snippets of a sentence that are the grounds for human judgment) by referring to the surrounding tokens combined with their unmasked rationales. As the model learns its reasoning ability from rationales through MRP, it performs hate speech detection robustly in terms of bias and explainability. The proposed method generally achieves state-of-the-art performance in various metrics, demonstrating its effectiveness for hate speech detection.
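
The masking step described in the abstract can be illustrated with a minimal sketch. It only shows how one training example might be built: some per-token rationale labels are hidden, the rest stay visible next to the tokens, and the hidden ones become prediction targets. The function name `build_mrp_example`, the `MASK` and `IGNORE` values, the masking rate, and the placeholder tokens are all assumptions for illustration; the abstract does not specify them, and this is not the authors' implementation.

```python
import random

MASK = 2       # hypothetical id standing for a hidden rationale label
IGNORE = -100  # hypothetical "do not predict" target; mirrors PyTorch's default ignore_index


def build_mrp_example(tokens, rationales, mask_prob=0.5, seed=0):
    """Hide a random subset of per-token rationale labels.

    tokens:     list of str
    rationales: list of 0/1 labels, 1 where annotators marked the token
                as part of the rationale
    Returns (tokens, visible_labels, targets), all of the same length.
    """
    if len(tokens) != len(rationales):
        raise ValueError("tokens and rationales must have the same length")
    rng = random.Random(seed)
    visible, targets = [], []
    for label in rationales:
        if rng.random() < mask_prob:
            visible.append(MASK)    # the model does not see this label ...
            targets.append(label)   # ... and has to predict it
        else:
            visible.append(label)   # unmasked rationale stays visible as context
            targets.append(IGNORE)  # no prediction required at this position
    return tokens, visible, targets


# Placeholder tokens and labels, not taken from any dataset.
example_tokens = ["those", "<group>", "are", "<insult>"]
example_rationales = [0, 1, 0, 1]
_, example_visible, example_targets = build_mrp_example(
    example_tokens, example_rationales, mask_prob=0.5, seed=0
)
```

In this sketch a model would receive `example_tokens` together with `example_visible` and be trained to recover the labels in `example_targets` wherever the target is not `IGNORE`. How the visible labels are combined with token embeddings is left open here, since the abstract does not describe it.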

Code Implementations: 1 repo
