CV · Dec 8, 2021

A Simple and efficient deep Scanpath Prediction

arXiv:2112.04610v1 · 2 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for efficient scanpath prediction in visual attention modeling, though it is incremental as it applies existing architectures in a simplified manner.

The paper tackles the problem of predicting visual scanpaths (human gaze sequences) using simple fully convolutional deep learning architectures, achieving competitive results on two datasets that sometimes surpass previous, more complex models.

A visual scanpath is the sequence of fixation points that the human gaze traverses while observing an image, and predicting it helps in modeling visual attention on that image. To this end, several models have been proposed in the literature using complex deep learning architectures and frameworks. Here, we explore the efficiency of using common deep learning architectures in a simple, fully convolutional, regressive manner. We evaluate how well these models can predict scanpaths on two datasets. We compare with other models using different metrics and show competitive results that sometimes surpass previous complex architectures. We also compare the different backbone architectures based on their performance in the experiments to deduce which ones are the most suitable for the task.
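To make the notion of a scanpath concrete, here is a minimal sketch that represents a scanpath as an ordered list of (x, y) fixations and compares a predicted scanpath with a recorded one. The names `normalize` and `mean_fixation_distance` are hypothetical, and the mean pointwise Euclidean distance is only an illustrative measure; the abstract does not name the metrics the paper uses.

```python
import math
from typing import List, Tuple

# A fixation is an (x, y) position; a scanpath is an ordered list of fixations.
Fixation = Tuple[float, float]
Scanpath = List[Fixation]


def normalize(scanpath: Scanpath, width: float, height: float) -> Scanpath:
    """Scale pixel coordinates to [0, 1] so images of different sizes are comparable."""
    return [(x / width, y / height) for x, y in scanpath]


def mean_fixation_distance(pred: Scanpath, target: Scanpath) -> float:
    """Mean Euclidean distance between fixations at the same position in the sequence.

    Assumption: the longer scanpath is truncated to the length of the shorter one.
    """
    n = min(len(pred), len(target))
    if n == 0:
        raise ValueError("both scanpaths must contain at least one fixation")
    return sum(math.dist(p, t) for p, t in zip(pred[:n], target[:n])) / n


if __name__ == "__main__":
    recorded = normalize([(320.0, 240.0), (160.0, 120.0)], width=640, height=480)
    predicted = normalize([(320.0, 240.0), (480.0, 120.0)], width=640, height=480)
    print(mean_fixation_distance(predicted, recorded))
```

A regression model of the kind the abstract describes would output the predicted coordinates directly, so a distance of this sort could serve either as a training loss or as a rough sanity check.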
