Hierarchical CVAE for Fine-Grained Hate Speech Classification
This work addresses automated hate speech detection, offering researchers and practitioners a finer-grained classification approach. Its contribution is incremental, since it builds on existing CVAE methods.
The paper tackles fine-grained hate speech classification, differentiating among 40 hate groups across 13 categories. It proposes a hierarchical Conditional Variational Autoencoder (CVAE) that incorporates hate category information to improve performance, and the resulting model outperforms commonly used discriminative models.
Existing work on automated hate speech detection typically focuses on binary classification or on differentiating among a small set of categories. In this paper, we propose a novel method for a fine-grained hate speech classification task, which focuses on differentiating among 40 hate groups across 13 hate group categories. We first explore the Conditional Variational Autoencoder (CVAE) as a discriminative model and then extend it to a hierarchical architecture that uses the additional hate category information for more accurate prediction. Experimentally, we show that incorporating the hate category information during training can significantly improve classification performance, and that our proposed model outperforms commonly used discriminative models.
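To make the idea of using category information during training concrete, the sketch below shows one simple way a two-level label hierarchy can add a training signal: a group-level negative log-likelihood plus a weighted category-level term, where category probabilities are obtained by summing group probabilities within each category. This is not the paper's hierarchical CVAE and is not taken from it. The toy mapping `GROUP_TO_CATEGORY`, the weight `alpha`, and all function names are assumptions made for illustration only.

```python
import math

# Hypothetical toy hierarchy: position = group index, value = category index.
# The paper's task has 40 groups in 13 categories; this 6-group mapping is made up.
GROUP_TO_CATEGORY = [0, 0, 1, 1, 1, 2]


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def category_probs(group_probs, group_to_category):
    """Sum the group probabilities that fall inside each category."""
    n_categories = max(group_to_category) + 1
    out = [0.0] * n_categories
    for group, p in enumerate(group_probs):
        out[group_to_category[group]] += p
    return out


def hierarchical_nll(group_logits, true_group, group_to_category, alpha=0.5):
    """Group-level NLL plus an alpha-weighted category-level NLL (assumed form)."""
    probs = softmax(group_logits)
    cats = category_probs(probs, group_to_category)
    true_category = group_to_category[true_group]
    group_term = -math.log(probs[true_group])
    category_term = -math.log(cats[true_category])
    return group_term + alpha * category_term


# Example: uniform scores over the six toy groups, true group is index 2.
loss = hierarchical_nll([0.0] * 6, true_group=2,
                        group_to_category=GROUP_TO_CATEGORY)
```

With `alpha=0` the loss reduces to the ordinary group-level term. A larger `alpha` gives more credit to a prediction that lands in the correct category even when the exact group is wrong.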