CV | AI | Jan 27, 2025

Complexity in Complexity: Understanding Visual Complexity Through Structure, Color, and Surprise

arXiv:2501.15890v3 | 2 citations | h-index: 6 | Has Code | CogSci
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurately and interpretably modeling visual complexity for researchers in visual cognition, though it is incremental as it builds on prior interpretable methods.

The paper tackles the problem of modeling human perception of visual complexity. It identifies limitations in a previous interpretable model and proposes new features (Multi-Scale Sobel Gradient, Multi-Scale Unique Color, and surprise scores) that improve predictive performance on existing benchmarks and a novel dataset.

Understanding how humans perceive visual complexity is a key area of study in visual cognition. Previous approaches to modeling visual complexity assessments have often resulted in intricate, difficult-to-interpret algorithms that employ numerous features or sophisticated deep learning architectures. While these complex models achieve high performance on specific datasets, they often sacrifice interpretability, making it challenging to understand the factors driving human perception of complexity. Recently, Shen et al. (2024) proposed an interpretable segmentation-based model that accurately predicted complexity across various datasets, supporting the idea that complexity can be explained simply. In this work, we investigate the failure of their model to capture structural, color, and surprisal contributions to complexity. To this end, we propose Multi-Scale Sobel Gradient (MSG), which measures spatial intensity variations; Multi-Scale Unique Color (MUC), which quantifies colorfulness across multiple scales; and surprise scores generated using a Large Language Model. We test our features on existing benchmarks and a novel dataset (Surprising Visual Genome) containing surprising images from Visual Genome. Our experiments demonstrate that modeling complexity accurately is not as simple as previously thought, requiring additional perceptual and semantic factors to address dataset biases. Our model improves predictive performance while maintaining interpretability, offering deeper insights into how visual complexity is perceived and assessed. Our code, analysis, and data are available at https://github.com/Complexity-Project/Complexity-in-Complexity.
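To make the two perceptual features in the abstract more concrete, the following is a minimal NumPy sketch of what a multi-scale Sobel gradient and a multi-scale unique-color count could look like. It is not the authors' implementation: the function names, the scale factors (1, 2, 4), the block-average downscaling, the 4-bit color quantization, and the unweighted averaging across scales are all assumptions made for illustration. The linked repository holds the actual definitions.

```python
import numpy as np


def sobel_magnitude(gray):
    """Mean Sobel gradient magnitude of a 2-D array (valid region only)."""
    g = gray.astype(np.float64)
    # Horizontal and vertical Sobel responses built from shifted slices.
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[1:-1, :-2] - g[2:, :-2])
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]
          - g[:-2, :-2] - 2 * g[:-2, 1:-1] - g[:-2, 2:])
    return float(np.mean(np.hypot(gx, gy)))


def downscale(img, factor):
    """Block-average an H x W (x C) array by an integer factor."""
    h = img.shape[0] // factor * factor
    w = img.shape[1] // factor * factor
    img = img[:h, :w].astype(np.float64)
    new_shape = (h // factor, factor, w // factor, factor) + img.shape[2:]
    return img.reshape(new_shape).mean(axis=(1, 3))


def multi_scale_gradient(rgb, factors=(1, 2, 4)):
    """Hypothetical MSG-style score: Sobel magnitude averaged over scales."""
    gray = rgb.astype(np.float64).mean(axis=2) / 255.0
    return float(np.mean([sobel_magnitude(downscale(gray, f)) for f in factors]))


def multi_scale_unique_colors(rgb, factors=(1, 2, 4), bits=4):
    """Hypothetical MUC-style score: quantized unique colors averaged over scales."""
    counts = []
    for f in factors:
        # Keep only the top `bits` bits per channel (assumed quantization).
        small = downscale(rgb, f).astype(np.uint8) >> (8 - bits)
        counts.append(len(np.unique(small.reshape(-1, 3), axis=0)))
    return float(np.mean(counts))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    flat = np.full((32, 32, 3), 128, dtype=np.uint8)
    noise = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
    for name, img in (("flat", flat), ("noise", noise)):
        print(name, multi_scale_gradient(img), multi_scale_unique_colors(img))
```

On a uniform image both scores sit at their minimum (a gradient near zero and a single color), while a random-noise image scores higher on both, which matches the intuition that these features track structural and chromatic variation.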

