CV SI · Apr 4, 2022

On Explaining Multimodal Hateful Meme Detection Models

Ming Shan Hee, Roy Ka-Wei Lee, Wen-Haw Chong

arXiv:2204.01734v218.258 citations · h-index: 23

Originality: Synthesis-oriented

AI Analysis

This paper addresses the problem of understanding and improving multimodal hateful meme detection models, which is relevant to both researchers and practitioners. Its contribution is incremental: it focuses on explaining existing methods rather than introducing new ones.

The paper investigates what pre-trained visual-linguistic models learn for hateful meme detection. It finds that the image modality contributes more to classification, and that the models can perform visual-text slurs grounding to some extent but also exhibit biases that lead to false positives.

Hateful meme detection is a new multimodal task that has gained significant traction in academic and industry research communities. Recently, researchers have applied pre-trained visual-linguistic models to perform the multimodal classification task, and some of these solutions have yielded promising results. However, what these visual-linguistic models learn for the hateful meme classification task remains unclear. For instance, it is unclear whether these models are able to capture derogatory or slur references across the modalities (i.e., image and text) of hateful memes. To fill this research gap, this paper proposes three research questions to improve our understanding of these visual-linguistic models performing the hateful meme classification task. We found that the image modality contributes more to the hateful meme classification task, and that the visual-linguistic models are able to perform visual-text slurs grounding to a certain extent. Our error analysis also shows that the visual-linguistic models have acquired biases, which result in false-positive predictions.
