CV | AI | Apr 29, 2025

MemeBLIP2: A novel lightweight multimodal system to detect harmful memes

arXiv:2504.21226v3 | 10 citations | h-index: 5
Originality: Incremental advance
AI Analysis

This work addresses the detection of harmful content in memes and offers an incremental advance for social media moderation and safety applications.

The paper tackles harmful-meme detection by introducing MemeBLIP2, a lightweight multimodal system that combines image and text features. It reports improved detection on the PrideMM dataset, capturing subtle cues even in ironic or culturally specific content.

Memes often merge visuals with brief text to share humor or opinions, yet some memes contain harmful messages such as hate speech. In this paper, we introduce MemeBLIP2, a lightweight multimodal system that detects harmful memes by combining image and text features effectively. We build on previous studies by adding modules that align image and text representations into a shared space and fuse them for better classification. Using BLIP-2 as the core vision-language model, our system is evaluated on the PrideMM dataset. The results show that MemeBLIP2 can capture subtle cues in both modalities, even in cases with ironic or culturally specific content, thereby improving the detection of harmful material.
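
The align-then-fuse idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the class name `HarmfulMemeSketch`, the feature sizes, the L2 normalization, the concatenation-based fusion, and the logistic classifier are all assumptions chosen for illustration, and random vectors stand in for the image and text features that a vision-language model such as BLIP-2 would produce.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature sizes; the abstract does not state the real ones.
IMG_DIM, TXT_DIM, SHARED_DIM = 32, 24, 16


def l2_normalize(x, eps=1e-8):
    """Scale each row to (approximately) unit length."""
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)


class HarmfulMemeSketch:
    """Toy align -> fuse -> classify pipeline with untrained random weights."""

    def __init__(self):
        # One linear projection per modality into the shared space (assumed design).
        self.w_img = rng.normal(0.0, 0.1, (IMG_DIM, SHARED_DIM))
        self.w_txt = rng.normal(0.0, 0.1, (TXT_DIM, SHARED_DIM))
        # Binary classifier over the fused vector (assumed design).
        self.w_cls = rng.normal(0.0, 0.1, (2 * SHARED_DIM,))
        self.b_cls = 0.0

    def align(self, img_feat, txt_feat):
        """Project both modalities into the same space and normalize them."""
        img_z = l2_normalize(img_feat @ self.w_img)
        txt_z = l2_normalize(txt_feat @ self.w_txt)
        return img_z, txt_z

    def fuse(self, img_z, txt_z):
        """Fuse by concatenation; other fusion schemes are possible."""
        return np.concatenate([img_z, txt_z], axis=-1)

    def predict_proba(self, img_feat, txt_feat):
        """Return a per-meme probability of the 'harmful' class."""
        fused = self.fuse(*self.align(img_feat, txt_feat))
        logits = fused @ self.w_cls + self.b_cls
        return 1.0 / (1.0 + np.exp(-logits))


# Usage: random stand-ins for a batch of four memes' image and text features.
model = HarmfulMemeSketch()
img_batch = rng.normal(size=(4, IMG_DIM))
txt_batch = rng.normal(size=(4, TXT_DIM))
probs = model.predict_proba(img_batch, txt_batch)
```

Because the weights are random and untrained, the resulting probabilities are meaningless as predictions; the sketch only shows how features from two modalities could flow through alignment, fusion, and classification.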

