SemiMemes: A Semi-supervised Learning Approach for Multimodal Memes Analysis
This work addresses the problem of censoring harmful content on social media for platform moderators, but its contribution is incremental, as it builds on existing multimodal techniques.
The research tackles sentiment analysis of memes for content moderation by proposing a multimodal semi-supervised learning approach, which outperforms state-of-the-art models on two datasets: Multimedia Automatic Misogyny Identification and Hateful Memes.
The prevalence of memes on social media has created the need to analyze the sentiment of their underlying meanings in order to censor harmful content. Machine-learning-based meme censoring systems call for a semi-supervised learning solution that takes advantage of the large number of unlabeled memes available on the internet and makes the annotation process less demanding. Moreover, the approach needs to utilize multimodal data, as a meme's meaning usually comes from both its image and its text. This research proposes a multimodal semi-supervised learning approach that outperforms other state-of-the-art multimodal semi-supervised and supervised learning models on two datasets, the Multimedia Automatic Misogyny Identification dataset and the Hateful Memes dataset. Building on the insights gained from Contrastive Language-Image Pre-training, an effective multimodal learning technique, this research introduces SemiMemes, a novel training method that combines auto-encoder and classification tasks to make use of the abundant unlabeled data.
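
To make the idea of combining an auto-encoder task with a classification task concrete, the following is a minimal sketch of one way such an objective could be computed. It is not the authors' implementation. The function names, the dictionary keys, the `recon_weight` parameter, and the choice to simply sum the two losses are all assumptions made for illustration. The sketch assumes each meme has already been turned into a feature vector (for example by a pre-trained image-text encoder), that an auto-encoder has produced a reconstruction of it, and that a classifier has produced a probability for the harmful class.

```python
import math


def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def binary_cross_entropy(prob, label):
    """Cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1 - label) * math.log(1.0 - prob))


def semi_supervised_loss(batch, recon_weight=1.0):
    """Hypothetical combined objective for a non-empty mixed batch.

    Each sample is a dict with illustrative keys (assumptions, not from the paper):
      'features'       - input feature vector of the meme
      'reconstruction' - auto-encoder output for that vector
      'prob'           - predicted probability of the positive class
      'label'          - 0 or 1, or None for an unlabeled meme
    Reconstruction loss uses every sample; classification loss uses labeled ones only.
    """
    recon = [mse(s["features"], s["reconstruction"]) for s in batch]
    cls = [
        binary_cross_entropy(s["prob"], s["label"])
        for s in batch
        if s["label"] is not None
    ]
    recon_loss = sum(recon) / len(recon)
    cls_loss = sum(cls) / len(cls) if cls else 0.0
    total = cls_loss + recon_weight * recon_loss
    return total, cls_loss, recon_loss


# Example: one labeled and one unlabeled meme.
batch = [
    {"features": [1.0, 0.0], "reconstruction": [1.0, 0.0], "prob": 0.5, "label": 1},
    {"features": [0.0, 2.0], "reconstruction": [0.0, 0.0], "prob": 0.9, "label": None},
]
total, cls_loss, recon_loss = semi_supervised_loss(batch)
```

In this sketch the unlabeled meme contributes only to the reconstruction term, which is how unlabeled data can still shape the learned representation. Whether the two tasks are trained jointly or in separate stages is a design decision that the abstract does not specify.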