CV, LG | Sep 11, 2023

Divergences in Color Perception between Deep Neural Networks and Humans

Ethan O. Nadler, Elise Darragh-Ford, Bhargav Srinivasa Desikan, Christian Conaway, Mark Chu, Tasker Hull, Douglas Guilbeault

arXiv:2309.05809

Originality: Incremental advance

AI Analysis

This work addresses a gap in understanding how well DNNs model human vision, specifically color perception, a question relevant to researchers in machine learning and cognitive science. The advance is incremental, since it builds on existing theories and methods.

The study asks whether deep neural networks (DNNs) capture human color perception. It develops experiments that evaluate the coherence of DNN color embeddings and compares the models' similarity judgments with human judgments. State-of-the-art DNNs diverge from human color similarity judgments, whereas a wavelet-based model yields more coherent embeddings that better predict them.

Deep neural networks (DNNs) are increasingly proposed as models of human vision, bolstered by their impressive performance on image classification and object recognition tasks. Yet, the extent to which DNNs capture fundamental aspects of human vision such as color perception remains unclear. Here, we develop novel experiments for evaluating the perceptual coherence of color embeddings in DNNs, and we assess how well these algorithms predict human color similarity judgments collected via an online survey. We find that state-of-the-art DNN architectures, including convolutional neural networks and vision transformers, provide color similarity judgments that strikingly diverge from human color judgments of (i) images with controlled color properties, (ii) images generated from online searches, and (iii) real-world images from the canonical CIFAR-10 dataset. We compare DNN performance against an interpretable and cognitively plausible model of color perception based on wavelet decomposition, inspired by foundational theories in computational neuroscience. While one deep learning model, a convolutional DNN trained on a style transfer task, captures some aspects of human color perception, our wavelet algorithm provides more coherent color embeddings that better predict human color judgments compared to all DNNs we examine. These results hold when altering the high-level visual task used to train similar DNN architectures (e.g., image classification versus image segmentation), as well as when examining the color embeddings of different layers in a given DNN architecture. These findings break new ground in the effort to analyze the perceptual representations of machine learning algorithms and to improve their ability to serve as cognitively plausible models of human vision. Implications for machine learning, human perception, and embodied cognition are discussed.
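
The comparison described in the abstract, model similarity judgments set against human ones, can be illustrated with a minimal sketch. This is not the authors' code. The function names, the toy three-dimensional embeddings, the human ratings, and the choice of cosine similarity with Spearman rank correlation are all assumptions made for illustration.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def average_ranks(values):
    """1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

def agreement_with_humans(embeddings, human_ratings):
    """Rank-correlate model pair similarities with human pair ratings.

    embeddings: hypothetical mapping of image id -> embedding vector.
    human_ratings: hypothetical mapping of (id_a, id_b) -> mean rating.
    """
    pairs = sorted(human_ratings)
    model = [cosine_similarity(embeddings[a], embeddings[b]) for a, b in pairs]
    human = [human_ratings[p] for p in pairs]
    return spearman(model, human)

# Toy data (invented for illustration, not taken from the paper).
toy_embeddings = {
    "red": [1.0, 0.0, 0.0],
    "orange": [1.0, 0.5, 0.0],
    "blue": [0.0, 0.1, 1.0],
}
toy_ratings = {
    ("red", "orange"): 0.9,
    ("red", "blue"): 0.1,
    ("orange", "blue"): 0.2,
}
toy_score = agreement_with_humans(toy_embeddings, toy_ratings)
```

A score near 1 means the model orders image pairs by similarity the same way the human raters do, and a score near 0 or below means the two orderings diverge. In this toy case the orderings match. A rank correlation is used here only because it is insensitive to the scale of the ratings; the paper may use a different agreement measure.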
