CV

Jan 25, 2024

Appearance Debiased Gaze Estimation via Stochastic Subject-Wise Adversarial Learning

Suneung Kim, Woo-Jeoung Nam, Seong-Whan Lee

arXiv:2401.13865v15.28 citations

Pattern Recognition

Originality: Incremental advance

AI Analysis

The paper addresses a key challenge in appearance-based gaze estimation, improving generalization across subjects, though the approach itself is an incremental advance.

The paper tackles overfitting to person-specific appearance factors in appearance-based gaze estimation by proposing SAZE, a framework that generalizes face appearance through adversarial learning and stochastic subject selection. It reports state-of-the-art errors of 3.89 on MPIIGaze and 4.42 on EyeDiap.
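The summary does not name the error metric. Assuming the reported figures are mean angular errors in degrees, as is conventional for gaze benchmarks, the per-sample error between a predicted and a ground-truth 3D gaze vector could be computed roughly as below. The function name `angular_error_deg` and the example vectors are illustrative and do not come from the paper.

```python
import math

def angular_error_deg(pred, target):
    """Angle in degrees between two 3D gaze vectors (assumed non-zero)."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm_p = math.sqrt(sum(p * p for p in pred))
    norm_t = math.sqrt(sum(t * t for t in target))
    # Clamp to [-1, 1] so rounding noise cannot push acos out of its domain.
    cos_sim = max(-1.0, min(1.0, dot / (norm_p * norm_t)))
    return math.degrees(math.acos(cos_sim))

# Hypothetical vectors, for illustration only.
err = angular_error_deg((0.05, -0.10, -1.0), (0.00, -0.10, -1.0))
print(f"angular error: {err:.2f} degrees")
```

A dataset-level score under this assumption would be the mean of these per-sample angles over the test set.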

Recently, appearance-based gaze estimation has been attracting attention in computer vision, and remarkable improvements have been achieved using various deep learning techniques. Despite such progress, most methods aim to infer gaze vectors from images directly, which causes overfitting to person-specific appearance factors. In this paper, we address these challenges and propose a novel framework: Stochastic subject-wise Adversarial gaZE learning (SAZE), which trains a network to generalize the appearance of subjects. We design a Face generalization Network (Fgen-Net) using a face-to-gaze encoder, a face identity classifier, and a proposed adversarial loss. The proposed loss generalizes face appearance factors so that the identity classifier infers a uniform probability distribution. In addition, the Fgen-Net is trained by a learning mechanism that optimizes the network by reselecting a subset of subjects at every training step to avoid overfitting. Our experimental results verify the robustness of the method in that it yields state-of-the-art performance, achieving 3.89 and 4.42 on the MPIIGaze and EyeDiap datasets, respectively. Furthermore, we demonstrate the positive generalization effect by conducting further experiments using face images of different styles generated by a generative model.
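The abstract describes two mechanisms without giving formulas: a loss that pushes the identity classifier toward a uniform output, and reselection of a subject subset at every training step. The sketch below illustrates one plausible reading of each, not the authors' implementation. It assumes the adversarial term is the cross-entropy between a uniform target and the classifier's softmax output, and that subjects are drawn uniformly at random without replacement. The names `uniform_confusion_loss` and `sample_subjects`, the subject count, and the subset size are hypothetical.

```python
import math
import random

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def uniform_confusion_loss(identity_logits):
    """Cross-entropy between a uniform target and the predicted identity
    distribution. It reaches its minimum, log(K), exactly when the
    classifier outputs the uniform distribution over K subjects.
    (Assumed form; the abstract does not give the actual loss.)"""
    probs = softmax(identity_logits)
    k = len(probs)
    return -sum(math.log(p) for p in probs) / k

def sample_subjects(subject_ids, subset_size, rng):
    """Draw a fresh subset of subjects for one training step."""
    return rng.sample(subject_ids, subset_size)

rng = random.Random(0)
subjects = list(range(10))   # hypothetical subject IDs
for step in range(3):
    subset = sample_subjects(subjects, 4, rng)
    # In a real loop, the encoder would be updated on images from `subset`.
    print(step, sorted(subset))

print(uniform_confusion_loss([0.0, 0.0, 0.0, 0.0]))  # identity-agnostic output
print(uniform_confusion_loss([5.0, 0.0, 0.0, 0.0]))  # identity-revealing output
```

Under these assumptions, an encoder trained to minimize this term alongside the gaze loss is rewarded for features from which the classifier cannot tell subjects apart, and the changing subject subset keeps the classifier from settling on a fixed set of identities.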
