MM · CL · CV · Dec 15, 2022

Ring That Bell: A Corpus and Method for Multimodal Metaphor Detection in Videos

arXiv:2301.01134v1291 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses multimodal metaphor detection and is aimed at researchers in computational linguistics and AI. It is an incremental advance, because the method relies primarily on text and does not fully leverage multimodal cues.

The authors tackled the problem of detecting metaphors in videos by creating the first openly available multimodal metaphor annotated corpus and developing a text-based detection method, which achieved an F1-score of 62% for metaphorical labels.

We present the first openly available corpus annotated for multimodal metaphor. The corpus consists of videos, including audio and subtitles, that have been annotated by experts. Furthermore, we present a method for detecting metaphors in the new dataset based on the textual content of the videos. The method achieves a high F1-score (62%) for metaphorical labels. We also experiment with other modalities and multimodal methods; however, these methods did not outperform the text-based model. In our error analysis, we identify cases where video could help in disambiguating metaphors, but the visual cues are too subtle for our model to capture. The data is available on Zenodo.
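For readers unfamiliar with the metric, the following is a minimal sketch of how an F1-score for a single label (here, the metaphorical class) can be computed from gold and predicted labels. The function name, the label strings, and the toy data are illustrative assumptions and are not taken from the authors' code or corpus.

```python
def f1_for_label(gold, predicted, label="metaphorical"):
    """Return the F1-score for one label, given parallel lists of labels.

    The label strings used here are hypothetical placeholders.
    """
    pairs = list(zip(gold, predicted))
    # True positives: predicted the label and it was correct.
    tp = sum(1 for g, p in pairs if g == label and p == label)
    # False positives: predicted the label but the gold label differs.
    fp = sum(1 for g, p in pairs if g != label and p == label)
    # False negatives: gold is the label but the prediction differs.
    fn = sum(1 for g, p in pairs if g == label and p != label)

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    # F1 is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Toy example (made-up labels, not corpus data).
gold = ["metaphorical", "literal", "metaphorical", "literal"]
pred = ["metaphorical", "metaphorical", "literal", "literal"]
score = f1_for_label(gold, pred)
```

Because the score is computed for the metaphorical label only, correct predictions of the other class do not raise it; this is why a per-label F1 can differ substantially from overall accuracy.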

