CL AI LG, Jan 11, 2022

A Feature Extraction based Model for Hate Speech Identification

Salar Mohtaj, Vera Schmitt, Sebastian Möller

arXiv:2201.04227v1, 0.84 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the detection of online content that can harm marginalized groups, but it is incremental: it applies existing methods to a new dataset from a shared task.

The paper tackles hate speech identification in Indo-European languages by evaluating several NLP models, including word- and character-level recurrent neural networks and BERT-based transfer learning. The transfer learning models achieved the best results in both competition subtasks.

The detection of hate speech online has become an important task, as offensive language such as hurtful, obscene and insulting content can harm marginalized people or groups. This paper presents the TU Berlin team's experiments and results on subtasks 1A and 1B of the shared task on hate speech and offensive content identification in Indo-European languages 2021. The performance of different Natural Language Processing models is evaluated for the respective subtasks throughout the competition. We tested recurrent neural network models at the word and character levels, as well as transfer learning approaches based on BERT, on the dataset provided by the competition. Among the tested models, the transfer learning-based models achieved the best results in both subtasks.
