CL AI IR
Jun 8, 2022

Improved two-stage hate speech classification for twitter based on Deep Neural Networks

arXiv:2206.04162v1 0.3 h-index: 1

Originality: Incremental advance

AI Analysis

This work addresses the automatic detection of hate speech, such as racism or sexism, in social media posts, a problem with societal and economic impact. The advance is incremental because it enhances an existing method rather than introducing a new one.

The paper tackles hate speech detection in tweets by proposing a two-stage deep neural network model that extends an existing LSTM approach, reporting superior classification quality compared to state-of-the-art methods on a public corpus of 16k tweets.

Hate speech is a form of online harassment that involves the use of abusive language, and it is commonly seen in social media posts. This sort of harassment mainly targets specific group characteristics such as religion, gender, or ethnicity, and it has both societal and economic consequences. The automatic detection of abusive language in text postings has always been a difficult task, but it has lately been receiving much interest from the scientific community. This paper addresses the important problem of discerning hateful content in social media. The model we propose in this work is an extension of an existing approach based on LSTM neural network architectures, which we enhanced and fine-tuned to detect certain forms of hateful language, such as racism or sexism, in a short text. The most significant enhancement is the conversion to a two-stage scheme consisting of Recurrent Neural Network (RNN) classifiers. The outputs of all One-vs-Rest (OvR) classifiers from the first stage are combined and used to train the second-stage classifier, which finally determines the type of harassment. Our study includes a performance comparison of several proposed alternative methods for the second stage, evaluated on a public corpus of 16k tweets, followed by a generalization study on another dataset. The reported results show the superior classification quality of the proposed scheme in the task of hate speech detection as compared to the current state of the art.
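The two-stage scheme described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the first-stage RNN classifiers are replaced here by small logistic-regression scorers, the second stage by a nearest-centroid rule, and the tweets by synthetic 2-D feature vectors. All function names (`train_binary`, `train_two_stage`, `predict`, `make_toy_data`) are hypothetical and exist only for this example. What the sketch keeps from the abstract is the data flow: one One-vs-Rest scorer per class, whose outputs are combined into a single vector that trains the second-stage classifier.

```python
import math
import random

LABELS = ["neutral", "racism", "sexism"]  # class names follow the abstract

def sigmoid(z):
    # Clamp the input to avoid overflow in math.exp.
    return 1.0 / (1.0 + math.exp(-max(-30.0, min(30.0, z))))

def train_binary(X, y, epochs=200, lr=0.1):
    """Fit one logistic-regression scorer by SGD (stand-in for a first-stage RNN)."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            w = [wj - lr * err * xj for wj, xj in zip(w, xi)]
            b -= lr * err
    return w, b

def score(model, x):
    w, b = model
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)

def train_two_stage(X, y):
    # Stage 1: one One-vs-Rest scorer per class.
    stage1 = [train_binary(X, [1.0 if yi == k else 0.0 for yi in y])
              for k in range(len(LABELS))]
    # Combine all OvR outputs into one meta-feature vector per example.
    meta = [[score(m, x) for m in stage1] for x in X]
    # Stage 2 (assumed stand-in): nearest-centroid classifier on the meta-features.
    centroids = []
    for k in range(len(LABELS)):
        rows = [m for m, yi in zip(meta, y) if yi == k]
        centroids.append([sum(col) / len(rows) for col in zip(*rows)])
    return stage1, centroids

def predict(model, x):
    stage1, centroids = model
    meta = [score(m, x) for m in stage1]
    dists = [sum((a - c) ** 2 for a, c in zip(meta, cen)) for cen in centroids]
    return dists.index(min(dists))  # index into LABELS

def make_toy_data(n_per_class=20, seed=0):
    """Synthetic, well-separated 2-D clusters standing in for tweet features."""
    rng = random.Random(seed)
    centers = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0)]
    pairs = [([cx + rng.uniform(-0.5, 0.5), cy + rng.uniform(-0.5, 0.5)], k)
             for k, (cx, cy) in enumerate(centers) for _ in range(n_per_class)]
    rng.shuffle(pairs)
    return [p[0] for p in pairs], [p[1] for p in pairs]

X, y = make_toy_data()
model = train_two_stage(X, y)
accuracy = sum(predict(model, x) == yi for x, yi in zip(X, y)) / len(y)
print("training accuracy:", accuracy)
```

In a setting closer to the paper, each `train_binary` call would be replaced by training an LSTM-based OvR classifier on tokenized tweets, and the nearest-centroid rule by whichever second-stage method is being compared. The sketch only fixes the interface between the two stages.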
