Appearance-Based Gaze Estimation in the Wild
This work addresses gaze estimation for applications such as human-computer interaction by providing a more variable dataset and an improved method, though the contribution is incremental in that it builds on existing techniques.
The authors tackled appearance-based gaze estimation in real-world settings by introducing the MPIIGaze dataset, which contains 213,659 images collected during natural laptop use. Their multimodal convolutional neural network method significantly outperformed state-of-the-art methods in cross-dataset evaluation.
Appearance-based gaze estimation is believed to work well in real-world settings, but existing datasets have been collected under controlled laboratory conditions and methods have not been evaluated across multiple datasets. In this work we study appearance-based gaze estimation in the wild. We present the MPIIGaze dataset that contains 213,659 images we collected from 15 participants during natural everyday laptop use over more than three months. Our dataset is significantly more variable than existing ones with respect to appearance and illumination. We also present a method for in-the-wild appearance-based gaze estimation using multimodal convolutional neural networks that significantly outperforms state-of-the-art methods in the most challenging cross-dataset evaluation. We present an extensive evaluation of several state-of-the-art image-based gaze estimation algorithms on three current datasets, including our own. This evaluation provides clear insights and allows us to identify key research challenges of gaze estimation in the wild.