CL, AI · Jun 11, 2018

Automatic Target Recovery for Hindi-English Code Mixed Puns

Srishti Aggarwal, Kritik Mathur, Radhika Mamidi

arXiv:1806.04535v1 · 0.2

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of processing humor in code-mixed language, with applications such as social media and advertising. The contribution is incremental, however, as it covers only one subtype of humor (puns) and one category of code-mixed puns (intra-sentential).

The paper tackles the problem of automatically identifying pun locations and recovering pun targets in Hindi-English code-mixed puns, recovering the target for 67% of the puns in a small dataset of advertisements.

In order for our computer systems to be more human-like, with a higher emotional quotient, they need to be able to process and understand intrinsic human language phenomena like humour. In this paper, we consider a subtype of humour, puns, which are a common type of wordplay-based joke. In particular, we consider code-mixed puns, which have become increasingly mainstream on social media, in informal conversations, and in advertisements, and aim to build a system that can automatically identify the pun location and recover the target of such puns. We first study and classify code-mixed puns into two categories, namely intra-sentential and intra-word, and then propose a four-step algorithm to recover the pun targets for puns belonging to the intra-sentential category. Our algorithm uses language models and phonetic similarity-based features to get the desired results. We test our approach on a small set of code-mixed punning advertisements and observe that our system successfully recovers the targets for 67% of the puns.
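
The general idea of combining a language model with phonetic similarity can be illustrated with a minimal sketch. It is not the paper's four-step algorithm, whose steps the abstract does not spell out. Everything here is an assumption made for illustration: the toy bigram counts, the example phrase "second to naan", the function names `locate_pun` and `recover_target`, and the use of plain string similarity on romanized spellings as a crude stand-in for real phonetic features.

```python
from difflib import SequenceMatcher

# Hypothetical toy bigram counts standing in for a real language model.
BIGRAMS = {
    ("<s>", "second"): 3,
    ("second", "to"): 3,
    ("to", "none"): 3,
    ("to", "go"): 2,
}
# Candidate target words, taken from the toy counts (sentence-start marker excluded).
VOCAB = sorted({w for pair in BIGRAMS for w in pair} - {"<s>"})


def bigram_prob(prev, word):
    """Add-one smoothed P(word | prev) under the toy counts."""
    context = sum(c for (p, _), c in BIGRAMS.items() if p == prev)
    return (BIGRAMS.get((prev, word), 0) + 1) / (context + len(VOCAB))


def phonetic_similarity(a, b):
    """Crude stand-in (assumption): similarity of the romanized spellings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def locate_pun(tokens):
    """Return the index of the token the language model finds least likely."""
    padded = ["<s>"] + tokens
    return min(
        range(len(tokens)),
        key=lambda i: bigram_prob(padded[i], padded[i + 1]),
    )


def recover_target(tokens, idx):
    """Pick the vocabulary word that sounds like the pun and fits the context."""
    prev = (["<s>"] + tokens)[idx]
    pun = tokens[idx]
    candidates = [w for w in VOCAB if w != pun]
    return max(
        candidates,
        key=lambda w: phonetic_similarity(pun, w) * bigram_prob(prev, w),
    )


if __name__ == "__main__":
    tokens = ["second", "to", "naan"]  # hypothetical code-mixed pun
    idx = locate_pun(tokens)
    print(tokens[idx], "->", recover_target(tokens, idx))
```

Multiplying the two scores means a candidate must both resemble the pun word and be plausible in context. A real system would presumably replace the spelling-based similarity with a pronunciation-based measure and the toy counts with a trained language model.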
