CL, AI, CV | May 22, 2025

IRONIC: Coherence-Aware Reasoning Chains for Multi-Modal Sarcasm Detection

arXiv:2505.16258v2 | 5 citations | h-index: 3 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the problem of interpreting figurative language in AI systems and offers an incremental improvement in multi-modal reasoning.

The paper tackles the challenge of detecting sarcasm in multi-modal inputs by introducing IRONIC, a framework that uses coherence relations for reasoning, achieving state-of-the-art performance in zero-shot multi-modal sarcasm detection.

Interpreting figurative language such as sarcasm across multi-modal inputs presents unique challenges, often requiring task-specific fine-tuning and extensive reasoning steps. However, current Chain-of-Thought approaches do not efficiently leverage the same cognitive processes that enable humans to identify sarcasm. We present IRONIC, an in-context learning framework that leverages Multi-modal Coherence Relations to analyze referential, analogical and pragmatic image-text linkages. Our experiments show that IRONIC achieves state-of-the-art performance on zero-shot Multi-modal Sarcasm Detection across different baselines. This demonstrates the need for incorporating linguistic and cognitive insights into the design of multi-modal reasoning strategies. Our code is available at: https://github.com/aashish2000/IRONIC
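To make the idea of coherence-guided prompting concrete, here is a minimal sketch of how a zero-shot prompt might walk a model through the three relation types named in the abstract (referential, analogical, pragmatic). It is not the authors' prompt or code: the question wording, the `Post` class, the `build_prompt` function, and the use of a text description in place of a real image are all assumptions made for illustration. The linked repository holds the actual implementation.

```python
from dataclasses import dataclass

# Relation names come from the abstract; the question wording is an
# illustrative assumption, not the prompt used in the paper.
COHERENCE_RELATIONS = {
    "referential": "Which entities or events in the text does the image depict or refer to?",
    "analogical": "What comparison or resemblance links the image to the text?",
    "pragmatic": "Does the image support or contradict the intent the text appears to express?",
}


@dataclass
class Post:
    text: str
    image_description: str  # hypothetical stand-in for the actual image input


def build_prompt(post: Post) -> str:
    """Assemble a zero-shot prompt with one reasoning step per coherence relation."""
    lines = [
        "You are given an image and its accompanying text.",
        f"Text: {post.text}",
        f"Image: {post.image_description}",
        "Analyze the image-text linkage step by step:",
    ]
    for i, (name, question) in enumerate(COHERENCE_RELATIONS.items(), start=1):
        lines.append(f"{i}. {name.capitalize()} relation: {question}")
    lines.append("Based on these relations, answer with 'sarcastic' or 'not sarcastic'.")
    return "\n".join(lines)


if __name__ == "__main__":
    example = Post(
        text="What lovely weather for a picnic.",
        image_description="A flooded park under heavy rain.",
    )
    print(build_prompt(example))
```

The resulting string would be sent to a multi-modal model together with the image. Keeping the relations in a single dictionary makes it easy to reorder or ablate individual reasoning steps.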

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
