CV · Feb 28, 2019

Extended Gaze Following: Detecting Objects in Videos Beyond the Camera Field of View

arXiv:1902.10953v1 · 16 citations
Originality: Incremental advance
AI Analysis

This work addresses object detection in videos for applications such as surveillance and human-computer interaction. It is an incremental advance because it builds on existing gaze-following methods.

The paper tackles the problem of detecting objects in videos based solely on people's gaze directions, even when the objects lie outside the camera's field of view. It proposes a novel top-view spatial representation of gaze directions and convolutional encoder/decoder networks trained on synthetic scenarios, and it validates the approach empirically on a publicly available dataset.

In this paper we address the problems of detecting objects of interest in a video and of estimating their locations, solely from the gaze directions of people present in the video. Objects may be located either inside or outside the camera field of view. We refer to this problem as extended gaze following. The contributions of the paper are as follows. First, we propose a novel spatial representation of the gaze directions adopting a top-view perspective. Second, we develop several convolutional encoder/decoder networks to predict object locations and compare them with heuristics and with classical learning-based approaches. Third, in order to train the proposed models, we generate a very large number of synthetic scenarios using a probabilistic formulation. Finally, our methodology is empirically validated using a publicly available dataset.
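To make the idea of a top-view gaze representation more concrete, here is a minimal sketch in Python. It is not the paper's representation or network: the abstract does not give those details. The sketch assumes each person is reduced to a 2D head position and a 2D gaze direction on the ground plane, and it substitutes a simple ray-accumulation heuristic for the learned encoder/decoder. The function names, the grid extent, and the Gaussian width `sigma` are all hypothetical.

```python
import numpy as np

def gaze_top_view_map(positions, directions, grid_size=64, extent=5.0, sigma=0.3):
    """Accumulate 2D gaze rays into a top-view heatmap (illustrative heuristic).

    positions:  (N, 2) head positions on the ground plane (assumed units: metres).
    directions: (N, 2) gaze directions projected onto the ground plane.
    The grid covers [-extent, extent] on both axes, so it can include
    regions that a camera placed inside the grid would not see.
    """
    positions = np.asarray(positions, dtype=float)
    directions = np.asarray(directions, dtype=float)
    # Normalise each direction to unit length.
    directions = directions / np.linalg.norm(directions, axis=1, keepdims=True)

    # Cell centres of the top-view grid: xs varies along columns, ys along rows.
    axis = np.linspace(-extent, extent, grid_size)
    xs, ys = np.meshgrid(axis, axis, indexing="xy")
    cells = np.stack([xs, ys], axis=-1)  # shape (grid_size, grid_size, 2)

    heatmap = np.zeros((grid_size, grid_size))
    for p, d in zip(positions, directions):
        rel = cells - p                                   # head-to-cell vectors
        along = rel @ d                                   # signed distance along the gaze ray
        perp = rel[..., 0] * d[1] - rel[..., 1] * d[0]    # perpendicular offset from the ray
        weight = np.exp(-0.5 * (perp / sigma) ** 2)       # Gaussian falloff around the ray
        heatmap += np.where(along > 0, weight, 0.0)       # keep only cells in front of the person
    return heatmap, axis

def peak_location(heatmap, axis):
    """Return the (x, y) coordinates of the strongest cell in the heatmap."""
    iy, ix = np.unravel_index(np.argmax(heatmap), heatmap.shape)
    return axis[ix], axis[iy]
```

Calling `gaze_top_view_map` with the positions and directions of several people and passing the result to `peak_location` gives a crude estimate of where their gazes converge. In the paper's setting, a map of this general kind would be an input to a learned model rather than the final prediction.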

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
