CV · AI · Sep 3, 2018

PathGAN: Visual Scanpath Prediction with Generative Adversarial Networks

arXiv:1809.00567v1 · 93 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of modeling stochastic human gaze patterns, with applications in computer vision and human-computer interaction. It represents an incremental advance.

The paper tackles the prediction of human visual scanpaths on images by introducing PathGAN, a generative adversarial network that improves on the state of the art for the iSUN and Salient360! datasets.

We introduce PathGAN, a deep neural network for visual scanpath prediction that is trained adversarially. A visual scanpath is the sequence of fixation points that a human observer's gaze traces over an image. PathGAN is composed of two parts, the generator and the discriminator. Both parts extract features from images using off-the-shelf networks and train recurrent layers to generate or discriminate scanpaths, respectively. In scanpath prediction, the stochastic nature of the data makes it very difficult to generate realistic predictions with supervised learning strategies, so we adopt adversarial training as a suitable alternative. Our experiments show that PathGAN improves the state of the art of visual scanpath prediction on the iSUN and Salient360! datasets. Source code and models are available at https://imatge-upc.github.io/pathgan/
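
To make the generator/discriminator split in the abstract concrete, here is a minimal sketch of the standard adversarial objective applied to scanpaths. It is an illustration under stated assumptions, not the paper's implementation: the `Scanpath` type, the normalized coordinates, the function names, and the example discriminator scores are all hypothetical, and the exact loss used by PathGAN may differ (see the linked source code).

```python
import math
from typing import List, Tuple

# Assumption: a scanpath is an ordered list of fixation points,
# written here as normalized (x, y) image coordinates.
Scanpath = List[Tuple[float, float]]


def bce(prob: float, target: float, eps: float = 1e-7) -> float:
    """Binary cross-entropy for a single probability/label pair."""
    p = min(max(prob, eps), 1.0 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1.0 - target) * math.log(1.0 - p))


def discriminator_loss(d_real: float, d_fake: float) -> float:
    """The discriminator wants D(human scanpath) -> 1 and D(generated) -> 0."""
    return bce(d_real, 1.0) + bce(d_fake, 0.0)


def generator_loss(d_fake: float) -> float:
    """The generator wants its scanpath scored as real (non-saturating form)."""
    return bce(d_fake, 1.0)


# Hypothetical example data: one human and one generated scanpath.
human_scanpath: Scanpath = [(0.50, 0.50), (0.31, 0.42), (0.72, 0.38)]
generated_scanpath: Scanpath = [(0.48, 0.55), (0.10, 0.90), (0.95, 0.05)]

# Hypothetical discriminator outputs for the two scanpaths above;
# a real discriminator would compute these from image features and the path.
d_real, d_fake = 0.9, 0.2

print("discriminator loss:", discriminator_loss(d_real, d_fake))
print("generator loss:", generator_loss(d_fake))
```

Because the generator is scored only on whether its output looks plausible to the discriminator, it is not pushed toward a single "average" scanpath, which is the motivation the abstract gives for preferring adversarial training over plain supervised regression.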

Code Implementations: 1 repo
