CL · SI · Feb 15, 2025

Evolving Hate Speech Online: An Adaptive Framework for Detection and Mitigation

arXiv:2502.10921v2 · 4 citations · h-index: 46
Originality: Incremental advance
AI Analysis

This work aims to improve online safety by enhancing hate speech detection for vulnerable communities. It is an incremental advance, since it builds on existing methods such as BERT and lexicon-based approaches.

The paper addresses the limitations of static lexicons in online hate speech detection with an adaptive framework that updates lexicons using word embeddings and combines BERT with lexicon-based techniques, achieving 95% accuracy on most state-of-the-art datasets.

The proliferation of social media platforms has led to an increase in the spread of hate speech, particularly targeting vulnerable communities. Unfortunately, existing methods for automatically identifying and blocking toxic language rely on pre-constructed lexicons, making them reactive rather than adaptive. As such, these approaches become less effective over time, especially when new communities are targeted with slurs not included in the original datasets. To address this issue, we present an adaptive approach that uses word embeddings to update lexicons and develop a hybrid model that adjusts to emerging slurs and new linguistic patterns. This approach can effectively detect toxic language, including intentional spelling mistakes employed by aggressors to avoid detection. Our hybrid model, which combines BERT with lexicon-based techniques, achieves an accuracy of 95% for most state-of-the-art datasets. Our work has significant implications for creating safer online environments by improving the detection of toxic content and proactively updating the lexicon.

Content Warning: This paper contains examples of hate speech that may be triggering.
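The idea of updating a lexicon with word embeddings can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the similarity measure, the threshold, or the embedding model, so cosine similarity, the `threshold` value, the function name `expand_lexicon`, and the toy vectors below are all assumptions made for illustration. Placeholder tokens stand in for real terms.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def expand_lexicon(lexicon, embeddings, threshold=0.8):
    """Return the seed lexicon plus every vocabulary term whose embedding
    lies within `threshold` cosine similarity of at least one seed term.

    Hypothetical helper: a single pass over the original seeds only,
    so newly added terms do not pull in further terms.
    """
    expanded = set(lexicon)
    for candidate, cand_vec in embeddings.items():
        if candidate in expanded:
            continue
        for seed in lexicon:
            seed_vec = embeddings.get(seed)
            if seed_vec is not None and cosine(cand_vec, seed_vec) >= threshold:
                expanded.add(candidate)
                break
    return expanded


# Toy 3-dimensional embeddings with made-up values (assumption, not real data).
toy_embeddings = {
    "slur_a": [0.9, 0.1, 0.0],
    "slur_a_misspelled": [0.88, 0.12, 0.02],  # deliberate misspelling, close in vector space
    "new_slur": [0.8, 0.3, 0.1],              # emerging term used in similar contexts
    "hello": [0.0, 0.2, 0.9],                 # unrelated benign word
}

seed_lexicon = {"slur_a"}
updated = expand_lexicon(seed_lexicon, toy_embeddings, threshold=0.9)
print(sorted(updated))
```

In this toy setup the misspelled variant and the emerging term fall close to the seed and are added, while the benign word is not. In practice the threshold would need tuning against labeled data, and the expanded lexicon would feed a lexicon-based signal that is combined with a BERT classifier, a step this sketch does not cover.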
