CV | Aug 28, 2018

Temporal Saliency Adaptation in Egocentric Videos

arXiv:1808.09559v21 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses saliency prediction in egocentric videos. The contribution is incremental, since it adapts existing methods to a specific domain.

The paper adapts an image saliency prediction model to egocentric videos, finding that temporal adaptation improves performance when viewers are stationary with a narrow field of view, as validated on the EgoMon dataset.

This work adapts a deep neural model for image saliency prediction to the temporal domain of egocentric video. We compute the saliency map for each video frame in two ways: first with an off-the-shelf model trained on static images, and second by adding convolutional or conv-LSTM layers trained on a dataset for video saliency prediction. We study each configuration on EgoMon, a new dataset of seven egocentric videos recorded by three subjects in both free-viewing and task-driven setups. Our results indicate that the temporal adaptation is beneficial when the viewer is not moving and observes the scene from a narrow field of view. Encouraged by this observation, we compute and publish the saliency maps for the EPIC Kitchens dataset, in which viewers are cooking. Source code and models are available at https://imatge-upc.github.io/saliency-2018-videosalgan/
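To make the second configuration concrete, the following is a minimal NumPy sketch of a conv-LSTM layer placed on top of per-frame static saliency maps. It is an illustration under stated assumptions, not the authors' implementation: `static_saliency` is a hypothetical placeholder for the off-the-shelf image model, the weights are random and untrained, and all names (`ConvLSTMCell`, `temporal_saliency`, `w_out`) are invented for this example.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, w):
    """Zero-padded 2D cross-correlation.

    x: (C_in, H, W), w: (C_out, C_in, k, k) with odd k -> (C_out, H, W).
    """
    c_out, _, k, _ = w.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, wd = x.shape[1:]
    out = np.zeros((c_out, h, wd))
    for i in range(k):
        for j in range(k):
            patch = xp[:, i:i + h, j:j + wd]  # input shifted by (i, j)
            out += np.einsum("oc,chw->ohw", w[:, :, i, j], patch)
    return out


class ConvLSTMCell:
    """Hypothetical conv-LSTM cell; weights are random, not trained."""

    def __init__(self, in_ch, hid_ch, k=3, seed=0):
        rng = np.random.default_rng(seed)
        self.hid_ch = hid_ch
        # One kernel yields all four gates from the stacked [input, hidden].
        self.w = rng.normal(0.0, 0.1, size=(4 * hid_ch, in_ch + hid_ch, k, k))
        self.b = np.zeros((4 * hid_ch, 1, 1))

    def step(self, x, h, c):
        gates = conv2d_same(np.concatenate([x, h], axis=0), self.w) + self.b
        i, f, o, g = np.split(gates, 4, axis=0)
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)  # update cell state
        h = sigmoid(o) * np.tanh(c)                   # new hidden state
        return h, c


def static_saliency(frame):
    """Placeholder for an image saliency model (assumption, not the paper's)."""
    contrast = np.abs(frame - frame.mean())
    return contrast / (contrast.max() + 1e-8)


def temporal_saliency(frames, cell, w_out):
    """Run the static model per frame, then refine with the recurrent layer."""
    h = np.zeros((cell.hid_ch,) + frames[0].shape)
    c = np.zeros_like(h)
    maps = []
    for frame in frames:
        s = static_saliency(frame)[None]  # (1, H, W)
        h, c = cell.step(s, h, c)
        # 1x1 readout from hidden channels to a single saliency map.
        maps.append(sigmoid(np.einsum("c,chw->hw", w_out, h)))
    return np.stack(maps)


rng = np.random.default_rng(0)
video = rng.random((5, 8, 8))  # 5 synthetic grayscale frames
cell = ConvLSTMCell(in_ch=1, hid_ch=4)
w_out = rng.normal(0.0, 0.1, size=4)
maps = temporal_saliency(video, cell, w_out)
print(maps.shape)  # (5, 8, 8)
```

The first configuration corresponds to calling `static_saliency` on each frame alone. The recurrent state `h, c` is what lets the second configuration carry information across frames, and in practice those weights would be learned from a video saliency dataset.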
