CV · May 14, 2015

CAT2000: A Large Scale Fixation Dataset for Boosting Saliency Research

arXiv:1505.03581v1 · 312 citations
Originality: Synthesis-oriented
AI Analysis

This dataset addresses the problem of small, biased datasets in saliency research for computer vision. The contribution is incremental: it provides a new resource rather than a novel method.

The authors tackled the risk of overfitting in saliency models by creating CAT2000, a large-scale dataset of 4000 images with eye-tracking data from 120 observers, covering 20 diverse categories to better gauge model progress.

Saliency modeling has been an active research area in computer vision for about two decades. Existing state-of-the-art models perform very well in predicting where people look in natural scenes. There is, however, a risk that these models have been overfitting to the available small-scale, biased datasets, thus trapping progress in a local minimum. To gain deeper insight into current issues in saliency modeling and to better gauge progress, we recorded eye movements of 120 observers while they freely viewed a large number of naturalistic and artificial images. Our stimuli include 4000 images: 200 from each of 20 categories covering different types of scenes, such as Cartoons, Art, Objects, Low resolution images, Indoor, Outdoor, Jumbled, Random, and Line drawings. We analyze some basic properties of this dataset and compare some successful models. We believe that our dataset opens new challenges for the next generation of saliency models and helps conduct behavioral studies on bottom-up visual attention.

Code Implementations: 2 repos
