MIMIC: Multimodal Islamophobic Meme Identification and Classification
This work addresses anti-Muslim hate speech in memes, a problem relevant to social media moderation and safety, but its contribution is incremental: it applies an existing method to a new domain-specific dataset.
The paper tackles the problem of identifying Islamophobic hate speech in memes by creating a novel dataset and proposing a classifier based on the Vision-and-Language Transformer (ViLT) that integrates visual and textual representations, achieving high detection accuracy.
Anti-Muslim hate speech has emerged within memes, characterized by context-dependent, rhetorical messages whose text and images mimic humor while conveying Islamophobic sentiments. This work presents a novel dataset and proposes a classifier based on the Vision-and-Language Transformer (ViLT), tailored to identify anti-Muslim hate in memes by integrating visual and textual representations. Our model leverages joint multimodal embeddings of meme images and their embedded text to capture nuanced Islamophobic narratives that are unique to meme culture, providing both high detection accuracy and interpretability.
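To make the idea of a joint image-text representation concrete, the following is a minimal, dependency-free sketch of how a ViLT-style single-stream input can be assembled: word embeddings and linearly projected image patches each receive a modality-type embedding and are concatenated into one sequence. It is an illustration under stated assumptions, not the authors' implementation. The class name `ToyJointEncoder`, the dimensions, and the random untrained weights are hypothetical, and mean pooling stands in for the transformer layers that ViLT would apply to the joint sequence.

```python
import math
import random


def patchify(image, patch):
    """Split an H x W grayscale image (list of rows) into flattened patches."""
    h, w = len(image), len(image[0])
    return [
        [image[r + i][c + j] for i in range(patch) for j in range(patch)]
        for r in range(0, h - patch + 1, patch)
        for c in range(0, w - patch + 1, patch)
    ]


def linear(vec, weights):
    """Apply a linear map; `weights` holds one row per output dimension."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]


def rand_matrix(rng, rows, cols):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class ToyJointEncoder:
    """Hypothetical toy model: joint text+patch sequence, then a binary head."""

    def __init__(self, vocab, patch=2, dim=8, seed=0):
        rng = random.Random(seed)
        self.patch = patch
        self.vocab = {tok: i for i, tok in enumerate(vocab)}
        # One extra row serves as the embedding for unknown tokens.
        self.tok_emb = rand_matrix(rng, len(vocab) + 1, dim)
        self.patch_proj = rand_matrix(rng, dim, patch * patch)
        # Modality-type embeddings: row 0 = text, row 1 = image.
        self.type_emb = rand_matrix(rng, 2, dim)
        self.head = [rng.uniform(-0.1, 0.1) for _ in range(dim)]

    def joint_sequence(self, text, image):
        unk = len(self.vocab)
        text_part = [
            [e + t for e, t in zip(self.tok_emb[self.vocab.get(tok, unk)],
                                   self.type_emb[0])]
            for tok in text.lower().split()
        ]
        image_part = [
            [e + t for e, t in zip(linear(p, self.patch_proj), self.type_emb[1])]
            for p in patchify(image, self.patch)
        ]
        # Single stream: text tokens followed by image patches.
        return text_part + image_part

    def predict_proba(self, text, image):
        seq = self.joint_sequence(text, image)
        # Assumption: mean pooling replaces the transformer encoder here.
        pooled = [sum(col) / len(seq) for col in zip(*seq)]
        logit = sum(w * x for w, x in zip(self.head, pooled))
        return 1.0 / (1.0 + math.exp(-logit))
```

Because the weights are random, the returned probability carries no meaning; the sketch only shows how both modalities end up in one sequence that a shared encoder and classification head can consume.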