CV | Mar 21, 2025

Meme Similarity and Emotion Detection using Multimodal Analysis

arXiv:2503.17493v1 | 3 citations | h-index: 11
Originality: Synthesis-oriented
AI Analysis

This research addresses the gap in analyzing the interplay between visual and textual components of memes for online culture and content moderation, though it is incremental as it applies existing models to a new multimodal context.

The study tackled the problem of comparing memes and detecting their emotions by employing a multimodal CLIP model for similarity assessment and a DistilBERT classifier for emotion categorization, achieving 67.23% agreement with human judgments on similarity and identifying anger and joy as dominant emotions.

Internet memes are a central element of online culture, blending images and text. While substantial research has focused on either the visual or textual components of memes, little attention has been given to their interplay. This gap raises a key question: What methodology can effectively compare memes and the emotions they elicit? Our study employs a multimodal methodological approach, analyzing both the visual and textual elements of memes. Specifically, we use a multimodal CLIP (Contrastive Language-Image Pre-training) model to group similar memes based on text and visual content embeddings, enabling robust similarity assessments across modalities. Using the Reddit Meme Dataset and Memotion Dataset, we extract low-level visual features and high-level semantic features to identify similar meme pairs. To validate these automated similarity assessments, we conducted a user study with 50 participants, asking them to provide yes/no responses regarding meme similarity and their emotional reactions. The comparison of experimental results with human judgments showed 67.23% agreement, suggesting that the computational approach aligns well with human perception. Additionally, we implemented a text-based classifier using the DistilBERT model to categorize memes into one of six basic emotions. The results indicate that anger and joy are the dominant emotions in memes, with motivational memes eliciting stronger emotional responses. This research contributes to the study of multimodal memes, enhancing both language-based and visual approaches to analyzing and improving online visual communication and user experiences. Furthermore, it provides insights for better content moderation strategies in online platforms.
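
The similarity step and the agreement figure described in the abstract can be illustrated with a small sketch. It assumes the image and text embeddings have already been produced by some encoder such as CLIP and are passed in as plain lists of floats. The fusion rule (concatenating unit-normalized vectors), the 0.8 threshold, the toy vectors, and all function names are hypothetical choices made for illustration; the abstract does not state how the authors fuse modalities or decide that a pair is similar.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    if norm == 0.0:
        raise ValueError("cannot normalize a zero vector")
    return [x / norm for x in vec]


def fuse(image_emb, text_emb):
    """Concatenate unit-normalized image and text embeddings.

    This fusion rule is an assumption, not the paper's stated method.
    """
    return l2_normalize(l2_normalize(image_emb) + l2_normalize(text_emb))


def cosine(a, b):
    # Both inputs are unit length, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))


def similar_pairs(memes, threshold=0.8):
    """Return {(id_a, id_b): bool} for every unordered pair of memes.

    `memes` maps a meme id to an (image_embedding, text_embedding) tuple.
    The threshold value is a hypothetical default.
    """
    fused = {mid: fuse(img, txt) for mid, (img, txt) in memes.items()}
    return {
        (a, b): cosine(fused[a], fused[b]) >= threshold
        for a, b in combinations(sorted(fused), 2)
    }


def percent_agreement(model_labels, human_labels):
    """Percentage of shared pairs where model and human yes/no labels match."""
    shared = model_labels.keys() & human_labels.keys()
    if not shared:
        raise ValueError("no pairs in common")
    hits = sum(model_labels[p] == human_labels[p] for p in shared)
    return 100.0 * hits / len(shared)


# Toy 2-dimensional embeddings standing in for real encoder output.
toy_memes = {
    "m1": ([1.0, 0.0], [0.0, 1.0]),
    "m2": ([0.9, 0.1], [0.1, 0.9]),
    "m3": ([0.0, 1.0], [1.0, 0.0]),
}
# Hypothetical majority yes/no answers from a user study.
toy_human = {("m1", "m2"): True, ("m1", "m3"): False, ("m2", "m3"): True}

toy_model = similar_pairs(toy_memes)
toy_agreement = percent_agreement(toy_model, toy_human)
```

With real data, the toy vectors would be replaced by encoder outputs for each meme's image and caption, and the yes/no labels by the participants' responses. Percent agreement is the simplest comparison; it does not correct for chance agreement.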

