CL CY MM May 25, 2023

MEMEX: Detecting Explanatory Evidence for Memes via Knowledge-Enriched Contextualization

Shivam Sharma, Ramaneswaran S, Udit Arora, Md. Shad Akhtar, Tanmoy Chakraborty

arXiv:2305.15913v226.6228 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of dynamically understanding the subtle messages in memes, which matters to social media users and researchers. It is an incremental advance because it builds on existing multimodal methods.

The paper tackles the problem of automatically explaining memes by mining contextual background from related documents, proposing a new task called MEMEX and creating a novel dataset (MCC) for it. The result is a multimodal neural framework (MIME) that achieves an absolute improvement of ~4% F1-score over the best baseline.

Memes are a powerful tool for communication over social media. Their affinity for evolving across politics, history, and sociocultural phenomena makes them an ideal communication vehicle. To comprehend the subtle message conveyed within a meme, one must understand the background that facilitates its holistic assimilation. Besides digital archiving of memes and their metadata by a few websites like knowyourmeme.com, there is currently no efficient way to deduce a meme's context dynamically. In this work, we propose a novel task, MEMEX: given a meme and a related document, the aim is to mine the context that succinctly explains the background of the meme. First, we develop MCC (Meme Context Corpus), a novel dataset for MEMEX. Further, to benchmark MCC, we propose MIME (MultImodal Meme Explainer), a multimodal neural framework that uses common sense enriched meme representation and a layered approach to capture the cross-modal semantic dependencies between the meme and the context. MIME surpasses several unimodal and multimodal systems and yields an absolute improvement of ~4% F1-score over the best baseline. Lastly, we conduct detailed analyses of MIME's performance, highlighting the aspects that could lead to optimal modeling of cross-modal contextual associations.
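To make the task setup concrete, here is a minimal sketch of how a MEMEX instance and its F1 evaluation might look. It is an illustration under stated assumptions, not the authors' MIME model or the MCC schema: it assumes the related document is pre-split into sentences and that the explanatory context is a subset of those sentences. The names `MemexExample`, `overlap_baseline`, and `f1_score`, and the toy data, are all hypothetical. The selector is a plain word-overlap heuristic standing in for a real multimodal model.

```python
from dataclasses import dataclass


@dataclass
class MemexExample:
    """Hypothetical container for one MEMEX instance (not the MCC schema)."""
    meme_text: str                 # text extracted from the meme (assumed available)
    document_sentences: list[str]  # related document, pre-split into sentences
    evidence: set[int]             # gold indices of the explanatory sentences


def f1_score(predicted: set[int], gold: set[int]) -> float:
    """F1 between predicted and gold evidence sentence indices."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


def overlap_baseline(example: MemexExample, top_k: int = 1) -> set[int]:
    """Toy text-only selector: rank sentences by word overlap with the meme text."""
    meme_words = set(example.meme_text.lower().split())
    scores = [
        (len(meme_words & set(sentence.lower().split())), index)
        for index, sentence in enumerate(example.document_sentences)
    ]
    # Highest overlap first; ties broken by earlier position in the document.
    ranked = sorted(scores, key=lambda pair: (-pair[0], pair[1]))
    return {index for _, index in ranked[:top_k]}


# Made-up toy instance, for illustration only.
example = MemexExample(
    meme_text="one does not simply walk into mordor",
    document_sentences=[
        "The film was released in 2001.",
        "Boromir warns that one does not simply walk into Mordor.",
        "It won several awards.",
    ],
    evidence={1},
)
predicted = overlap_baseline(example)
print(predicted, f1_score(predicted, example.evidence))
```

A text-only heuristic like this ignores the image entirely, which is the gap a multimodal framework such as MIME is meant to close. The "absolute improvement of ~4%" in the abstract refers to a difference in F1 points of this kind, not a relative gain.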
