CV, Oct 10, 2018

Invariance Analysis of Saliency Models versus Human Gaze During Scene Free Viewing

arXiv:1810.04456v1 (1 citation)
Originality: Incremental advance
AI Analysis

This paper addresses the effect of real-world image distortions on saliency modeling and computer vision applications. The advance is incremental: it builds on existing saliency research by adding a systematic distortion analysis.

The paper systematically investigates how 19 types of image distortions affect human gaze and saliency models. It finds that distortions cause observers to look at different locations and drastically hinder model performance, with rotation and shearing causing the largest drop. It also shows that distortions used as data augmentation can improve deep saliency models against distortions when those distortions preserve human gaze, while distortions that severely change human gaze degrade performance.

Most current studies of human gaze and saliency modeling use high-quality stimuli. In the real world, however, captured images undergo various types of distortion along the acquisition, transmission, and display chain, including motion blur, lighting variations, and rotation. Despite a few efforts, the influence of these ubiquitous distortions on visual attention and saliency models has not been systematically investigated. In this paper, we first create a large-scale database containing eye movements of 10 observers over 1900 images degraded by 19 types of distortions. Second, by analyzing eye movements and saliency models, we find that: a) observers look at different locations in distorted versus original images, and b) the performance of saliency models is drastically hindered on distorted images, with the largest performance drops occurring for the Rotation and Shearing distortions. Finally, we investigate the effectiveness of different distortions when used as data augmentation transformations. Experimental results verify that useful data augmentation transformations, which preserve the human gaze of the reference images, can improve deep saliency models against distortions, while invalid transformations, which severely change human gaze, degrade performance.
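As a rough illustration of the two geometric distortions the abstract singles out, the sketch below applies rotation and shearing to a small 2D grid with a nearest-neighbour affine warp. It is not the authors' pipeline: the helper names `affine_warp`, `rotation`, and `shear_x`, the centre-of-image pivot, the zero fill value, and the sign convention for the angle are all assumptions made for this example.

```python
import math


def affine_warp(img, matrix, fill=0):
    """Warp a 2D grid (list of rows) by a 2x2 matrix about the image centre.

    Hypothetical helper for illustration. Uses inverse mapping with
    nearest-neighbour sampling: for each output pixel, look up the source
    pixel that the forward transform would have moved there.
    """
    h, w = len(img), len(img[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    (a, b), (c, d) = matrix
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("matrix is not invertible")
    # Inverse of the forward matrix [[a, b], [c, d]].
    ia, ib, ic, id_ = d / det, -b / det, -c / det, a / det
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx, dy = x - cx, y - cy
            sx = int(round(ia * dx + ib * dy + cx))
            sy = int(round(ic * dx + id_ * dy + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = img[sy][sx]  # pixels mapped from outside keep `fill`
    return out


def rotation(deg):
    """2x2 rotation matrix (angle sign convention is an assumption)."""
    t = math.radians(deg)
    return ((math.cos(t), -math.sin(t)), (math.sin(t), math.cos(t)))


def shear_x(k):
    """2x2 horizontal shear: x' = x + k * y, y' = y."""
    return ((1.0, k), (0.0, 1.0))


if __name__ == "__main__":
    image = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
    for row in affine_warp(image, rotation(180)):
        print(row)
    for row in affine_warp(image, shear_x(1)):
        print(row)
```

When such a warp is used for data augmentation, whether the gaze map is warped along with the image or the reference image's gaze map is reused is a design decision the abstract does not spell out, so the sketch leaves it open.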
