When did you become so smart, oh wise one?! Sarcasm Explanation in Multi-modal Multi-party Dialogues
This work addresses the problem of AI agents understanding sarcasm in conversations. The contribution is incremental: it builds on sarcasm detection by adding explanation generation.
The paper tackles the challenge of explaining sarcasm in multimodal, multi-party dialogues by introducing the Sarcasm Explanation in Dialogue (SED) task and curating the WITS dataset. It proposes the MAF (Modality Aware Fusion) module, which outperforms traditional multimodal fusion baselines on almost all metrics for generating natural language explanations.
Indirect speech such as sarcasm achieves a constellation of discourse goals in human communication. While the indirectness of figurative language allows speakers to achieve certain pragmatic goals, it is challenging for AI agents to comprehend such idiosyncrasies of human communication. Though sarcasm identification has been a well-explored topic in dialogue analysis, for conversational systems to truly grasp a conversation's innate meaning and generate appropriate responses, simply detecting sarcasm is not enough; it is vital to explain its underlying sarcastic connotation to capture its true essence. In this work, we study the discourse structure of sarcastic conversations and propose a novel task: Sarcasm Explanation in Dialogue (SED). Set in a multimodal and code-mixed setting, the task aims to generate natural language explanations of sarcastic conversations. To this end, we curate WITS, a new dataset to support our task. We propose MAF (Modality Aware Fusion), a multimodal context-aware attention and global information fusion module to capture multimodality, and use it to benchmark WITS. The proposed attention module surpasses the traditional multimodal fusion baselines and reports the best performance on almost all metrics. Lastly, we carry out detailed quantitative and qualitative analyses.
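
To make the idea of context-aware attention followed by fusion more concrete, the following is a minimal sketch, not the MAF module itself. It assumes that text and a second modality (here called audio) are already encoded as vectors of the same dimension, uses plain scaled dot-product attention with no learned projections, and blends the result with a parameter-free sigmoid gate. The function names `cross_modal_attention` and `gated_fusion`, the gate formula, and the toy inputs are all hypothetical choices made for illustration.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_modal_attention(text, other):
    """For each text vector (query), attend over the other modality's vectors.

    Assumption: both modalities share the same dimension and no learned
    projection matrices are applied (a real model would learn them).
    """
    d = len(text[0])
    attended = []
    for q in text:
        weights = softmax([dot(q, k) / math.sqrt(d) for k in other])
        ctx = [sum(w * v[i] for w, v in zip(weights, other)) for i in range(d)]
        attended.append(ctx)
    return attended


def gated_fusion(text, attended):
    """Blend each text vector with its attended context via a scalar gate.

    Assumption: the gate is a sigmoid of the dot product, chosen only so the
    sketch runs without trained parameters.
    """
    fused = []
    for t, a in zip(text, attended):
        gate = 1.0 / (1.0 + math.exp(-dot(t, a)))
        fused.append([gate * ai + (1.0 - gate) * ti for ti, ai in zip(t, a)])
    return fused


# Toy inputs: two text token vectors and three audio frame vectors.
text = [[1.0, 0.0], [0.0, 1.0]]
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

attended = cross_modal_attention(text, audio)
fused = gated_fusion(text, attended)
```

The fused output keeps one vector per text token, so under these assumptions it could be passed on to a text generation model in place of the original text representation.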