CV NC Mar 29, 2021

CNN-based search model underestimates attention guidance by simple visual features

arXiv:2103.15439v2, 2 citations

Originality: Synthesis-oriented

AI Analysis

This work sits between cognitive science and computer vision: it highlights limitations of standard CNNs as models of human attention. The contribution is incremental, since it builds on prior research.

The study found that a CNN-based search model adapted from Zhang et al. (2018) considerably underestimates human attention guidance by simple visual features in simulated feature and conjunction search experiments. No concrete numerical results are provided.

Recently, Zhang et al. (2018) proposed an interesting model of attention guidance that uses visual features learnt by convolutional neural networks for object recognition. I adapted this model for search experiments with accuracy as the measure of performance. Simulation of our previously published feature and conjunction search experiments revealed that the CNN-based search model considerably underestimates human attention guidance by simple visual features. A simple explanation is that the model has no bottom-up guidance of attention. Another view might be that standard CNNs do not learn the features required for human-like attention guidance.
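To make the idea of feature-based attention guidance with an accuracy measure concrete, here is a minimal sketch. It is not the author's implementation and not the Zhang et al. (2018) code. It assumes that guidance can be approximated by matching a target feature vector against the features at each location of a search display, and that a trial counts as correct when the highest-priority location is the target. The function names (`attention_map`, `first_fixation`, `search_accuracy`) are hypothetical, and plain NumPy arrays stand in for CNN feature maps.

```python
import numpy as np

def attention_map(feature_map, target_features):
    """Top-down priority map (assumed form): dot product between the
    target feature vector (C,) and the features at each location of
    the display (H, W, C). Returns an (H, W) array."""
    return feature_map @ target_features

def first_fixation(feature_map, target_features):
    """Location (row, col) with the highest priority."""
    amap = attention_map(feature_map, target_features)
    row, col = np.unravel_index(np.argmax(amap), amap.shape)
    return int(row), int(col)

def search_accuracy(trials, target_features):
    """Fraction of trials in which the first selected location is the
    target. `trials` is a list of (feature_map, target_location)."""
    hits = sum(first_fixation(fm, target_features) == loc
               for fm, loc in trials)
    return hits / len(trials)

# Toy display: every location holds a "distractor" feature except (2, 3),
# which holds the "target" feature. Stand-in data, not CNN activations.
display = np.zeros((5, 5, 4))
display[..., 1] = 1.0                    # distractor feature everywhere
display[2, 3] = [1.0, 0.0, 0.0, 0.0]     # target feature at one location
target = np.array([1.0, 0.0, 0.0, 0.0])

accuracy = search_accuracy([(display, (2, 3))], target)
```

Note that the priority map in this sketch depends only on the match to the target features. It contains no saliency term, which corresponds to the absence of bottom-up guidance that the abstract offers as one explanation.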
