CV · Oct 21, 2024

Visual Motif Identification: Elaboration of a Curated Comparative Dataset and Classification Methods

arXiv:2410.15866v1 · h-index: 21 · ECCV Workshops
Originality: Incremental advance
AI Analysis

This work addresses the need for automated analysis of artistic elements in visual media, which benefits researchers and filmmakers. Its contribution is incremental, since it builds on existing methods such as CLIP.

The paper tackles the problem of recognizing and classifying visual motifs in cinema. It proposes a machine learning model that feeds CLIP features into a shallow network trained with a custom loss, achieving an F1-score of 0.91 on a test set covering 20 motifs.

In cinema, visual motifs are recurrent iconographic compositions that carry artistic or aesthetic significance. Their use throughout the history of visual arts and media is interesting to researchers and filmmakers alike. Our goal in this work is to recognise and classify these motifs by proposing a new machine learning model that uses a custom dataset to that end. We show how features extracted from a CLIP model can be leveraged by using a shallow network and an appropriate loss to classify images into 20 different motifs, with surprisingly good results: an $F_1$-score of 0.91 on our test set. We also present several ablation studies justifying the input features, architecture and hyperparameters used.
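To make the reported metric concrete, here is a minimal sketch of how an $F_1$-score over several motif classes could be computed. The abstract does not say how the score is averaged across the 20 motifs, so macro-averaging (the unweighted mean of per-class scores) is an assumption here. The function names `per_class_f1` and `macro_f1` are hypothetical, and the toy labels are illustrative only, not data from the paper.

```python
from collections import Counter


def per_class_f1(y_true, y_pred, num_classes):
    """Return a list with the F1 score of each class (0.0 if a class never appears)."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1          # correct prediction for this class
        else:
            fp[pred] += 1           # predicted class gets a false positive
            fn[truth] += 1          # true class gets a false negative
    scores = []
    for c in range(num_classes):
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return scores


def macro_f1(y_true, y_pred, num_classes):
    """Unweighted mean of per-class F1 (assumed averaging; not confirmed by the source)."""
    return sum(per_class_f1(y_true, y_pred, num_classes)) / num_classes


if __name__ == "__main__":
    # Toy example with 3 classes standing in for motif labels.
    y_true = [0, 0, 1, 1, 2, 2]
    y_pred = [0, 1, 1, 1, 2, 0]
    print(per_class_f1(y_true, y_pred, 3))
    print(macro_f1(y_true, y_pred, 3))
```

With macro-averaging, a rare motif counts as much as a frequent one, which matters if the 20 classes are imbalanced; a micro- or weighted average would give a different number on the same predictions.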

