CL, LG, MM, SI · Jul 18, 2023

Multi-Modal Discussion Transformer: Integrating Text, Images and Graph Transformers to Detect Hate Speech on Social Media

arXiv:2307.09312v4 · 20 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This paper aims to make hate speech detection on social media platforms more accurate. It is an incremental advance that combines existing modalities (text and images) with graph transformers.

The paper tackles hate speech detection in online social networks by proposing the Multi-Modal Discussion Transformer (mDT), which integrates text, images, and graph transformers to analyze discussions holistically, resulting in improved detection performance as validated on a new dataset, HatefulDiscussions.

We present the Multi-Modal Discussion Transformer (mDT), a novel method for detecting hate speech in online social networks such as Reddit discussions. In contrast to traditional comment-only methods, our approach to labelling a comment as hate speech involves a holistic analysis of text and images grounded in the discussion context. This is done by leveraging graph transformers to capture the contextual relationships in the discussion surrounding a comment and grounding the interwoven fusion layers that combine text and image embeddings instead of processing modalities separately. To evaluate our work, we present a new dataset, HatefulDiscussions, comprising complete multi-modal discussions from multiple online communities on Reddit. We compare the performance of our model to baselines that only process individual comments and conduct extensive ablation studies.
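
The idea in the abstract, fusing each comment's text and image embeddings and then letting a comment attend to its surrounding discussion, can be illustrated with a toy sketch. This is not the authors' implementation. Element-wise averaging stands in for the learned fusion layers, and a single attention step restricted to adjacent comments stands in for a graph transformer. All names here (`fuse`, `neighbours`, `contextualise`) and the three-comment thread are hypothetical.

```python
import math

def fuse(text_vec, image_vec):
    """Toy fusion (assumption): average text and image embeddings; text only if no image."""
    if image_vec is None:
        return list(text_vec)
    return [(t + i) / 2 for t, i in zip(text_vec, image_vec)]

def neighbours(parents, node):
    """Comments adjacent to `node` in the reply tree: itself, its parent, its direct replies."""
    adjacent = {node}
    if parents[node] is not None:
        adjacent.add(parents[node])
    adjacent.update(child for child, parent in parents.items() if parent == node)
    return sorted(adjacent)

def contextualise(parents, embeddings, node):
    """One attention step over adjacent comments: softmax of dot products, then weighted sum."""
    query = embeddings[node]
    keys = neighbours(parents, node)
    scores = [sum(q * k for q, k in zip(query, embeddings[n])) for n in keys]
    top = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - top) for s in scores]
    total = sum(weights)
    return [
        sum(w / total * embeddings[n][d] for w, n in zip(weights, keys))
        for d in range(len(query))
    ]

# Hypothetical thread: comments 1 and 2 both reply to comment 0; only comment 0 has an image.
parents = {0: None, 1: 0, 2: 0}
text = {0: [1.0, 0.0], 1: [0.0, 1.0], 2: [1.0, 1.0]}
images = {0: [0.0, 1.0], 1: None, 2: None}

embeddings = {c: fuse(text[c], images[c]) for c in parents}
context_vec = contextualise(parents, embeddings, 1)
```

In this sketch the representation of comment 1 becomes a weighted mix of its own embedding and that of the comment it replies to, which is the sense in which a comment is read "in context" rather than in isolation. A real model would learn the fusion and attention parameters and feed the contextualised vector to a classifier.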

Code Implementations: 1 repo
