CV · AI · NC · Mar 5, 2024

Simplicity in Complexity: Explaining Visual Complexity using Deep Segmentation Models

arXiv:2403.03134v3 · h-index: 6 · CogSci
Originality: Incremental advance
AI Analysis

This work addresses the challenge of understanding visual complexity for cognitive science and AI applications, offering an interpretable and generalizable model that improves over previous complex or dataset-specific methods.

The paper models visual complexity with a segment-based approach: SAM quantifies the number of segments at multiple granularities, and FC-CLIP quantifies the number of classes in an image. A simple linear model with these two features explains complexity well across six diverse image datasets.

The complexity of visual stimuli plays an important role in many cognitive phenomena, including attention, engagement, memorability, time perception and aesthetic evaluation. Despite its importance, complexity is poorly understood and, ironically, previous models of image complexity have been quite complex. There have been many attempts to find handcrafted features that explain complexity, but these features are usually dataset-specific, and hence fail to generalise. On the other hand, more recent work has employed deep neural networks to predict complexity, but these models remain difficult to interpret, and do not guide a theoretical understanding of the problem. Here we propose to model complexity using segment-based representations of images. We use state-of-the-art segmentation models, SAM and FC-CLIP, to quantify the number of segments at multiple granularities, and the number of classes in an image, respectively. We find that complexity is well-explained by a simple linear model with these two features across six diverse image-sets of naturalistic scene and art images. This suggests that the complexity of images can be surprisingly simple.
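
The two-feature linear model described in the abstract can be illustrated with a minimal sketch. This is not the authors' code. It assumes the segment count and class count per image have already been extracted (for example with SAM and FC-CLIP), and it uses ordinary least squares on synthetic, noise-free numbers generated purely for illustration. The function names, the coefficients, and any feature scaling are hypothetical and do not come from the paper.

```python
import numpy as np


def fit_linear_complexity(num_segments, num_classes, ratings):
    """Fit ratings ~ a * num_segments + b * num_classes + c by least squares.

    Returns the coefficient vector [a, b, c].
    """
    X = np.column_stack([num_segments, num_classes, np.ones(len(ratings))])
    coef, *_ = np.linalg.lstsq(X, ratings, rcond=None)
    return coef


def predict_complexity(coef, num_segments, num_classes):
    """Apply the fitted linear model to new feature values."""
    num_segments = np.asarray(num_segments, dtype=float)
    num_classes = np.asarray(num_classes, dtype=float)
    return coef[0] * num_segments + coef[1] * num_classes + coef[2]


# Synthetic stand-in data (NOT from the paper): random feature values and
# ratings generated from made-up coefficients, so the fit can be checked.
rng = np.random.default_rng(0)
num_segments = rng.integers(5, 200, size=50).astype(float)
num_classes = rng.integers(1, 30, size=50).astype(float)
true_coef = np.array([0.3, 1.5, 10.0])  # hypothetical values
ratings = (
    true_coef[0] * num_segments
    + true_coef[1] * num_classes
    + true_coef[2]
)

coef = fit_linear_complexity(num_segments, num_classes, ratings)
print("fitted coefficients:", coef)
```

With real data, `ratings` would be the human complexity scores for each image, and the fitted coefficients would indicate how much each feature contributes to the predicted complexity.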

Code Implementations: 1 repo