Improving Automatic Hate Speech Detection with Multiword Expression Features
This work addresses automated hate speech detection for social media platforms. The contribution is incremental: it adds new features to an existing neural network framework.
The authors tackle hate speech detection in social media by integrating multiword expression features into a deep neural network, which yields significant macro-F1 improvements over a baseline system on two tweet corpora.
Automatic detection of hate speech in social media is attracting growing attention. Given the enormous volume of content posted daily, human monitoring of hate speech is infeasible. In this work, we propose new word-level features for automatic hate speech detection (HSD): multiword expressions (MWEs). MWEs are lexical units larger than a single word that have idiomatic and compositional meanings. We propose to integrate MWE features into a deep neural network-based HSD framework. Our baseline HSD system relies on the Universal Sentence Encoder (USE). To incorporate MWE features, we create a three-branch deep neural network: one branch for USE, one for MWE categories, and one for MWE embeddings. We conduct experiments on two hate speech tweet corpora with different MWE categories and with two types of MWE embeddings, word2vec and BERT. Our experiments demonstrate that the proposed HSD system with MWE features significantly outperforms the baseline system in terms of macro-F1.
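
As a rough illustration of the three-branch design described above, the following sketch runs one forward pass in plain Python. It is not the system evaluated here. The abstract does not specify layer types or sizes, so everything below is an assumption made for illustration: one ReLU dense layer per branch, fusion by concatenation, a softmax output, 8 MWE categories, a 300-dimensional MWE embedding, 3 output classes, random untrained weights, and all function and parameter names. The 512-dimensional input matches the output size of the published USE models.

```python
import math
import random


def dense_relu(inputs, weights, biases):
    """Fully connected layer followed by ReLU; one output per weight row."""
    return [max(0.0, sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(rng, n_in, n_out):
    """Random weights for illustration only; a real model would learn them."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def three_branch_forward(use_vec, mwe_cat_vec, mwe_emb_vec, params):
    """Encode each input in its own branch, concatenate, then classify."""
    h_use = dense_relu(use_vec, *params["use"])          # sentence branch
    h_cat = dense_relu(mwe_cat_vec, *params["mwe_cat"])  # MWE category branch
    h_emb = dense_relu(mwe_emb_vec, *params["mwe_emb"])  # MWE embedding branch
    fused = h_use + h_cat + h_emb                        # list concatenation
    w_out, b_out = params["out"]
    logits = [sum(w * x for w, x in zip(row, fused)) + b
              for row, b in zip(w_out, b_out)]
    return softmax(logits)


# Hypothetical sizes (assumptions, see the text above).
USE_DIM, N_MWE_CATEGORIES, MWE_EMB_DIM, HIDDEN, N_CLASSES = 512, 8, 300, 16, 3

rng = random.Random(0)
params = {
    "use": init_layer(rng, USE_DIM, HIDDEN),
    "mwe_cat": init_layer(rng, N_MWE_CATEGORIES, HIDDEN),
    "mwe_emb": init_layer(rng, MWE_EMB_DIM, HIDDEN),
    "out": init_layer(rng, 3 * HIDDEN, N_CLASSES),
}

# Stand-in inputs for one tweet: a sentence vector, a multi-hot vector of the
# MWE categories found in the tweet, and an averaged MWE embedding.
use_vec = [rng.gauss(0.0, 1.0) for _ in range(USE_DIM)]
mwe_cat_vec = [1.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0, 0.0]
mwe_emb_vec = [rng.gauss(0.0, 1.0) for _ in range(MWE_EMB_DIM)]

probs = three_branch_forward(use_vec, mwe_cat_vec, mwe_emb_vec, params)
print(probs)  # class probabilities for this tweet
```

Keeping the branches separate until the concatenation step lets each feature type be projected to a comparable size before fusion, so that the low-dimensional MWE category vector is not swamped by the much larger sentence embedding. In a trained system, the random weights would be replaced by learned parameters, and the stand-in inputs by real USE vectors and word2vec or BERT MWE embeddings.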