It's Written All Over Your Face: Full-Face Appearance-Based Gaze Estimation
This work addresses gaze estimation for human affect analysis and reports significant performance gains, most notably under challenging conditions such as extreme head poses. The contribution is incremental, however, since it builds on earlier findings that information from the full face region benefits gaze estimation.
The paper proposes a full-face appearance-based gaze estimation method: a convolutional neural network encodes the face image, and spatial weights applied to its feature maps enhance or suppress information from different facial regions. For person-independent 3D gaze estimation, it improves on the state of the art by up to 14.3% on MPIIGaze and 27.7% on EYEDIAP.
Eye gaze is an important non-verbal cue for human affect analysis. Recent gaze estimation work indicated that information from the full face region can benefit performance. Pushing this idea further, we propose an appearance-based method that, in contrast to a long-standing line of work in computer vision, only takes the full face image as input. Our method encodes the face image using a convolutional neural network with spatial weights applied on the feature maps to flexibly suppress or enhance information in different facial regions. Through extensive evaluation, we show that our full-face method significantly outperforms the state of the art for both 2D and 3D gaze estimation, achieving improvements of up to 14.3% on MPIIGaze and 27.7% on EYEDIAP for person-independent 3D gaze estimation. We further show that this improvement is consistent across different illumination conditions and gaze directions and particularly pronounced for the most challenging extreme head poses.
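The spatial weighting idea described above can be illustrated with a minimal NumPy sketch. The text states only that spatial weights are applied to the feature maps; how the weight map is produced is not specified here, so the 1x1 convolution followed by a ReLU is an illustrative assumption, and the function names, tensor shapes, and random inputs are hypothetical.

```python
import numpy as np


def spatial_weights(features, w, b):
    """Compute a single-channel spatial weight map from feature maps.

    features: (C, H, W) activations from a convolutional layer.
    w: (C,) weights of a hypothetical 1x1 convolution; b: scalar bias.
    (Assumption: the source does not say how the map is computed.)
    """
    # A 1x1 convolution is a per-location weighted sum over channels.
    logits = np.tensordot(w, features, axes=([0], [0])) + b  # shape (H, W)
    # ReLU keeps the weights non-negative: 0 suppresses a location,
    # values above 1 enhance it.
    return np.maximum(logits, 0.0)


def apply_spatial_weights(features, weight_map):
    """Scale every channel by the same (H, W) weight map."""
    # Broadcasting applies one weight per spatial location to all channels.
    return features * weight_map[np.newaxis, :, :]


# Toy example with random activations standing in for CNN features.
rng = np.random.default_rng(0)
features = rng.standard_normal((4, 3, 3))
w = rng.standard_normal(4)
weight_map = spatial_weights(features, w, b=0.1)
weighted = apply_spatial_weights(features, weight_map)
print(weighted.shape)  # (4, 3, 3)
```

Because a single map is shared across channels, a weight of zero removes a facial region from every feature channel at once, while larger weights amplify regions that carry more gaze information. In a trained network the map would be learned jointly with the rest of the model rather than set by hand.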