CV · NC · AP · Sep 26, 2014

How close are we to understanding image-based saliency?

arXiv:1409.7686v1 · 5 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurately modeling spatial saliency, a problem of interest to computer vision and neuroscience researchers, and shows that it is more difficult than previously thought.

The authors tackled the problem of evaluating how well saliency models explain human gaze fixations by framing them probabilistically as point processes, and found that state-of-the-art models capture only one third of the explainable spatial information.

Within the set of the many complex factors driving gaze placement, the properties of an image that are associated with fixations under free viewing conditions have been studied extensively. There is a general impression that the field is close to understanding this particular association. Here we frame saliency models probabilistically as point processes, allowing the calculation of log-likelihoods and bringing saliency evaluation into the domain of information. We compare the information gain of state-of-the-art models to a gold standard and find that only one third of the explainable spatial information is captured. We additionally provide a principled method to show where and how models fail to capture information in the fixations. Thus, contrary to previous assertions, purely spatial saliency remains a significant challenge.
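The log-likelihood framing in the abstract can be illustrated with a small sketch. It assumes that a saliency map is normalized into a probability distribution over pixels, that information gain is measured as the average log-likelihood difference (in bits per fixation) relative to a baseline density, and that the "explained" fraction is the model's gain divided by the gold standard's gain. The function names, the uniform baseline, and the toy 2x2 densities are hypothetical choices for illustration, not taken from the paper.

```python
import numpy as np

def normalize(saliency_map):
    """Turn a non-negative map into a probability distribution over pixels."""
    saliency_map = np.asarray(saliency_map, dtype=float)
    return saliency_map / saliency_map.sum()

def mean_log_likelihood(density, fixations):
    """Average log2-probability of the fixated pixels (bits per fixation)."""
    rows = [r for r, _ in fixations]
    cols = [c for _, c in fixations]
    return float(np.mean(np.log2(density[rows, cols])))

def information_gain(density, baseline, fixations):
    """Log-likelihood improvement of `density` over `baseline`, in bits per fixation."""
    return (mean_log_likelihood(density, fixations)
            - mean_log_likelihood(baseline, fixations))

# Toy data (hypothetical): four fixations given as (row, col) on a 2x2 image.
fixations = [(0, 0), (0, 0), (0, 1), (1, 0)]

baseline = normalize(np.ones((2, 2)))             # assumed uniform baseline
model = normalize([[0.5, 0.25], [0.125, 0.125]])  # stand-in for a saliency model
gold = normalize([[0.5, 0.25], [0.25, 0.0]])      # stand-in for a gold standard

ig_model = information_gain(model, baseline, fixations)
ig_gold = information_gain(gold, baseline, fixations)

# Fraction of the explainable information that the model captures.
explained = ig_model / ig_gold
print(f"model: {ig_model:.2f} bits/fix, gold: {ig_gold:.2f} bits/fix, "
      f"explained: {explained:.0%}")
```

Under these assumptions, a model that assigns zero probability to a fixated pixel would receive an infinitely negative log-likelihood, so in practice densities would need to be strictly positive wherever fixations can occur.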
