CV AI · Nov 25, 2020

Multimodal Learning for Hateful Memes Detection

arXiv:2011.12870v3 · 16.680 citations · h-index: 15 · Has Code

Originality: Incremental advance

AI Analysis

This work matters to social media platforms and their users because it supports automatically identifying hateful memes and limiting the spread of harmful content.

This paper addresses the problem of detecting hateful memes, where images and text are often weakly aligned, by proposing a method that integrates image captioning into the detection process. The model achieved promising results on the Hateful Memes Detection Challenge.

Memes are used to spread ideas through social networks. Although most memes are created for humor, some become hateful through the combination of image and text. Automatically detecting hateful memes can help reduce their harmful social impact. Unlike conventional multimodal tasks, where the visual and textual information is semantically aligned, hateful memes detection is challenging because the image and text in a meme are often weakly aligned or even irrelevant to each other, which requires the model to understand the content and reason over multiple modalities. In this paper, we focus on multimodal hateful memes detection and propose a novel method that incorporates image captioning into the detection process. We conduct extensive experiments on multimodal meme datasets and illustrate the effectiveness of our approach. Our model achieves promising results on the Hateful Memes Detection Challenge.
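The abstract does not describe the architecture, so the following is only a minimal sketch of the general idea of treating a generated caption as a third input alongside the image and the meme text. It assumes simple late fusion (concatenation followed by a logistic scorer), which may differ from the authors' actual model. The function name `fuse_and_score`, the feature vectors, and the weights are all hypothetical placeholders chosen for illustration.

```python
import math


def fuse_and_score(image_feat, text_feat, caption_feat, weights, bias=0.0):
    """Concatenate three feature vectors and apply a logistic scorer.

    Hypothetical late-fusion sketch; not the method from the paper.
    """
    # Late fusion: image, meme text, and generated caption side by side.
    fused = list(image_feat) + list(text_feat) + list(caption_feat)
    if len(fused) != len(weights):
        raise ValueError("weights must match the fused feature length")
    logit = sum(w * x for w, x in zip(weights, fused)) + bias
    # Sigmoid maps the logit to a score in (0, 1).
    return 1.0 / (1.0 + math.exp(-logit))


# Toy, hand-picked numbers standing in for encoder outputs (assumptions).
image_feat = [0.2, 0.7]      # e.g. from an image encoder
text_feat = [0.9, 0.1]       # e.g. from the text overlaid on the meme
caption_feat = [0.4, 0.3]    # e.g. from a caption generated for the image
weights = [0.5, -0.25, 1.0, 0.0, -1.0, 0.5]

score = fuse_and_score(image_feat, text_feat, caption_feat, weights)
print(f"hateful score: {score:.3f}")
```

In a setup like this, the caption features give the classifier a textual description of the image that can be compared with the meme text, which is one plausible way to address the weak alignment between the two modalities.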

