Causal Representation-Based Domain Generalization on Gaze Estimation
This work addresses domain generalization for gaze estimation, an incremental improvement for computer vision applications such as human-computer interaction.
The paper tackles the problem of domain discrepancy degrading gaze estimation performance by proposing the CauGE framework, which uses causal mechanisms and adversarial training to extract domain-invariant features, achieving state-of-the-art results on a domain generalization benchmark for gaze estimation.
The availability of extensive datasets containing gaze information for each subject has significantly improved gaze estimation accuracy. However, the discrepancy between domains severely degrades the performance of a model trained explicitly for a particular domain. In this paper, we propose the Causal Representation-Based Domain Generalization on Gaze Estimation (CauGE) framework, designed based on the general principles of causal mechanisms, which are consistent with the domain difference. We employ adversarial training and an additional penalty term to extract domain-invariant features. After feature extraction, we place an attention layer so that the features are sufficient for inferring the actual gaze. With these modules, CauGE ensures that the neural networks learn from representations that satisfy the general principles of causal mechanisms. As a result, CauGE generalizes across domains by extracting domain-invariant features, and spurious correlations cannot influence the model. Our method achieves state-of-the-art performance on the domain generalization benchmark for gaze estimation.
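The three ingredients named in the abstract (adversarial training, a penalty term, and an attention layer) can be sketched as a single training objective for the feature extractor. The abstract does not give the loss formulas, so everything below is an illustrative assumption rather than the CauGE implementation: the L1 gaze loss, the cross-entropy domain discriminator, the decorrelation-style penalty, the softmax attention, and all function names and weights (`adv_weight`, `penalty_weight`) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_reweight(features, scores):
    # Assumed attention form: rescale each feature by a softmax weight.
    return features * softmax(scores, axis=-1)

def domain_cross_entropy(domain_logits, domain_labels):
    # Loss of a domain discriminator that guesses the source domain.
    probs = softmax(domain_logits, axis=-1)
    picked = probs[np.arange(len(domain_labels)), domain_labels]
    return float(-np.log(picked + 1e-12).mean())

def decorrelation_penalty(features):
    # Assumed penalty: squared off-diagonal covariance between features.
    centered = features - features.mean(axis=0, keepdims=True)
    cov = centered.T @ centered / max(len(features) - 1, 1)
    off_diag = cov - np.diag(np.diag(cov))
    return float((off_diag ** 2).sum())

def extractor_objective(gaze_pred, gaze_true, domain_logits, domain_labels,
                        features, adv_weight=0.1, penalty_weight=0.01):
    # The extractor minimizes gaze error while *maximizing* the
    # discriminator's loss (hence the minus sign), plus the penalty.
    gaze_loss = float(np.abs(gaze_pred - gaze_true).mean())
    domain_loss = domain_cross_entropy(domain_logits, domain_labels)
    penalty = decorrelation_penalty(features)
    return gaze_loss - adv_weight * domain_loss + penalty_weight * penalty
```

Under these assumptions, the objective is lower when the discriminator cannot tell domains apart, which is the sense in which adversarial training pushes the extractor toward domain-invariant features. In a real setup the discriminator would be trained in alternation to minimize `domain_cross_entropy`.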