CV | Jan 1, 2020

A Coarse-to-Fine Adaptive Network for Appearance-Based Gaze Estimation

arXiv:2001.00187v1 | 207 citations
AI Analysis

This work addresses gaze estimation for applications like human-computer interaction, but it is incremental as it builds on existing methods by better integrating face and eye features.

The paper aims to improve gaze estimation accuracy with a coarse-to-fine adaptive network (CA-Net) that exploits the intrinsic correlation between face and eye images, achieving state-of-the-art results on the MPIIGaze and EyeDiap datasets.

Human gaze is essential for various appealing applications. Aiming at more accurate gaze estimation, a series of recent works propose to utilize face and eye images simultaneously. Nevertheless, face and eye images only serve as independent or parallel feature sources in those works, and the intrinsic correlation between their features is overlooked. In this paper we make the following contributions: 1) We propose a coarse-to-fine strategy which estimates a basic gaze direction from the face image and refines it with the corresponding residual predicted from the eye images. 2) Guided by the proposed strategy, we design a framework which introduces a bi-gram model to bridge the gaze residual and the basic gaze direction, and an attention component to adaptively acquire suitable fine-grained features. 3) Integrating the above innovations, we construct a coarse-to-fine adaptive network named CA-Net and achieve state-of-the-art performance on MPIIGaze and EyeDiap.
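The coarse-to-fine idea in the abstract can be illustrated with a small sketch. It is not the authors' implementation: CA-Net learns its face and eye features, its bi-gram model, and its attention weights with neural networks, whereas the code below only shows the arithmetic of the two ideas. The function names, the (pitch, yaw) representation in radians, and the softmax weighting over two eye feature vectors are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Turn raw attention scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_eye_features(left, right, scores):
    """Hypothetical attention step: weight left/right eye features.

    `scores` holds one raw score per eye; in a real model these would be
    predicted by a learned attention component (assumption).
    """
    w_left, w_right = softmax(scores)
    return [w_left * l + w_right * r for l, r in zip(left, right)]


def refine_gaze(basic, residual):
    """Coarse-to-fine step: final gaze = basic direction + residual.

    `basic` stands for the estimate from the face image and `residual`
    for the correction predicted from the eye images, both given as
    (pitch, yaw) in radians (assumed representation).
    """
    return tuple(b + r for b, r in zip(basic, residual))


# Usage with made-up numbers
fused = fuse_eye_features([1.0, 2.0], [3.0, 4.0], [0.0, 0.0])
final_gaze = refine_gaze((0.10, -0.20), (0.02, 0.05))
```

Adding a residual to a coarse estimate lets the eye branch learn only a small correction rather than the full gaze direction, which is the motivation the abstract gives for the strategy.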

