CL AI Nov 20, 2022

Explaining (Sarcastic) Utterances to Enhance Affect Understanding in Multimodal Dialogues

Shivani Kumar, Ishani Mondal, Md Shad Akhtar, Tanmoy Chakraborty

arXiv:2211.11049v22.122 citations h-index: 41 Has Code

Originality: Incremental advance

AI Analysis

This work addresses affect understanding in conversational AI, with applications such as chatbots and social media analysis. It is an incremental advance, building on existing sarcasm explanation methods.

The paper tackles the challenge of understanding figurative language such as sarcasm in multimodal dialogues. It proposes MOSES, a deep neural network that generates natural language explanations for sarcastic utterances. Using these explanations improves sarcasm detection by ~14% F1-score on average, and humour identification and emotion recognition by ~2%.

Conversations emerge as the primary media for exchanging ideas and conceptions. From the listener's perspective, identifying various affective qualities, such as sarcasm, humour, and emotions, is paramount for comprehending the true connotation of the emitted utterance. However, one of the major hurdles faced in learning these affect dimensions is the presence of figurative language, viz. irony, metaphor, or sarcasm. We hypothesize that any detection system constituting the exhaustive and explicit presentation of the emitted utterance would improve the overall comprehension of the dialogue. To this end, we explore the task of Sarcasm Explanation in Dialogues (SED), which aims to unfold the hidden irony behind sarcastic utterances. We propose MOSES, a deep neural network, which takes a multimodal (sarcastic) dialogue instance as an input and generates a natural language sentence as its explanation. Subsequently, we leverage the generated explanation for various natural language understanding tasks in a conversational dialogue setup, such as sarcasm detection, humour identification, and emotion recognition. Our evaluation shows that MOSES outperforms the state-of-the-art system for SED by an average of ~2% on different evaluation metrics, such as ROUGE, BLEU, and METEOR. Further, we observe that leveraging the generated explanation advances three downstream tasks for affect classification: an average improvement of ~14% F1-score in the sarcasm detection task and ~2% in the humour identification and emotion recognition tasks. We also perform extensive analyses to assess the quality of the results.
