CLAICY
Apr 4, 2019

Learning to Decipher Hate Symbols

arXiv:1904.02418v1
1094 citations
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of detecting covert hate speech for content moderation, though its methodological contribution appears incremental.

The paper tackles the problem of understanding hate symbols in online hate speech by proposing a novel deciphering task, achieving improved generalization to unseen symbols with a new Variational Decipher model.

Existing computational models to understand hate speech typically frame the problem as a simple classification task, bypassing the understanding of hate symbols (e.g., 14 words, kigy) and their secret connotations. In this paper, we propose a novel task of deciphering hate symbols. To do this, we leverage the Urban Dictionary and collect a new, symbol-rich Twitter corpus of hate speech. We investigate neural network latent context models for deciphering hate symbols. More specifically, we study Sequence-to-Sequence models and show how they are able to crack the ciphers based on context. Furthermore, we propose a novel Variational Decipher and show how it can generalize better to unseen hate symbols in a more challenging testing setting.
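As a rough illustration of the task setup described in the abstract, the sketch below pairs tweets with dictionary definitions and holds out some symbols entirely, so that test symbols are unseen during training. It is not the authors' code: the function names, the data layout, the placeholder symbols (`sym_a`, `sym_b`), and the single-token matching rule are all assumptions made for illustration.

```python
def build_pairs(tweets, definitions):
    """Pair each tweet with the definition of every listed symbol it contains.

    Assumption: symbols are single whitespace-delimited tokens; multi-word
    symbols would need a different matching rule.
    """
    pairs = []
    for tweet in tweets:
        tokens = tweet.lower().split()
        for symbol, definition in definitions.items():
            if symbol in tokens:
                # The tweet is the context (model input); the definition is the target.
                pairs.append({"symbol": symbol, "context": tweet, "target": definition})
    return pairs


def split_by_symbol(pairs, held_out):
    """Keep held-out symbols entirely out of training, so test symbols are unseen."""
    train = [p for p in pairs if p["symbol"] not in held_out]
    test = [p for p in pairs if p["symbol"] in held_out]
    return train, test


# Placeholder data (hypothetical, not taken from the paper's corpus).
definitions = {
    "sym_a": "placeholder definition a",
    "sym_b": "placeholder definition b",
}
tweets = [
    "some text with sym_a inside",
    "another post using sym_b here",
    "sym_a appears again",
    "no symbol in this one",
]

pairs = build_pairs(tweets, definitions)
train, test = split_by_symbol(pairs, held_out={"sym_b"})
```

Splitting by symbol rather than by tweet is one way to approximate the "more challenging testing setting" the abstract mentions: a model evaluated this way has to infer a definition from context alone, since it has never seen the symbol itself.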

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
