CL, LG · Dec 9, 2021

Combining Textual Features for the Detection of Hateful and Offensive Language

arXiv:2112.04803v1 · 5 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the exposure of social media users to cyberbullying, but it appears incremental: it focuses on combining features within existing neural network architectures.

The paper tackles the detection of hateful and offensive language on Twitter by analyzing combinations of textual features (contextual word embeddings, character-level embeddings, and hate term encodings) and evaluates them on the HASOC-2021 dataset.

The detection of offensive, hateful and profane language has become a critical challenge since many users in social networks are exposed to cyberbullying activities on a daily basis. In this paper, we present an analysis of combining different textual features for the detection of hateful or offensive posts on Twitter. We provide a detailed experimental evaluation to understand the impact of each building block in a neural network architecture. The proposed architecture is evaluated on English Subtask 1A (identifying hateful, offensive and profane content in posts) of the HASOC-2021 dataset under the team name TIB-VA. We compare different variants of contextual word embeddings combined with character-level embeddings and an encoding of collected hate terms.
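To make the idea of combining the three feature types concrete, here is a minimal sketch in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say how the features are fused, so simple concatenation is assumed; the lexicon `LEXICON`, the helpers `hate_term_encoding` and `combine_features`, and the embedding values are all hypothetical placeholders.

```python
from typing import List, Sequence

# Hypothetical mini-lexicon; the hate terms collected in the paper are not listed here.
LEXICON: List[str] = ["idiot", "stupid", "moron"]


def hate_term_encoding(text: str, lexicon: Sequence[str] = LEXICON) -> List[float]:
    """Multi-hot vector: 1.0 where a lexicon term occurs as a token in the text."""
    # Lowercase and strip surrounding punctuation so "idiot!" matches "idiot".
    tokens = {tok.strip(".,!?;:\"'").lower() for tok in text.split()}
    return [1.0 if term in tokens else 0.0 for term in lexicon]


def combine_features(
    word_vec: Sequence[float],
    char_vec: Sequence[float],
    term_vec: Sequence[float],
) -> List[float]:
    """Concatenate the three feature vectors into one classifier input.

    Concatenation is an assumed fusion strategy, chosen for simplicity.
    """
    return list(word_vec) + list(char_vec) + list(term_vec)


# Placeholder values standing in for real encoder outputs.
word_vec = [0.12, -0.40, 0.88]  # stands in for a pooled contextual word embedding
char_vec = [0.05, 0.31]         # stands in for a character-level embedding
term_vec = hate_term_encoding("You are such an idiot!")

features = combine_features(word_vec, char_vec, term_vec)
```

In a real system the combined vector would feed a classification layer, and dropping one of the three inputs at a time is one way to measure the contribution of each building block.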

Code Implementations: 1 repo
