CV AI Nov 14, 2025

3D Gaussian and Diffusion-Based Gaze Redirection

arXiv:2511.11231v1 h-index: 10 Has Code
Originality: Incremental advance
AI Analysis

DiT-Gaze offers an improved method for creating synthetic training data to enhance gaze estimation models, addressing a domain-specific need in computer vision.

The paper tackles the problem of high-fidelity gaze redirection for generating augmented data to improve gaze estimators, reducing the state-of-the-art gaze error by 4.1%, to 6.353 degrees.
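As a rough check on the reported figures, the sketch below back-computes the prior gaze error they imply. It assumes the 4.1% is a relative reduction and that 6.353 degrees is the resulting error; the function name `implied_baseline` is hypothetical, and the derived baseline is an inference from the two quoted numbers, not a value stated in the paper.

```python
def implied_baseline(new_error_deg: float, relative_reduction: float) -> float:
    """Prior error implied by a relative reduction that ends at new_error_deg.

    Assumes new = old * (1 - relative_reduction), so old = new / (1 - reduction).
    """
    return new_error_deg / (1.0 - relative_reduction)


# Figures quoted in the summary: 4.1% reduction, ending at 6.353 degrees.
baseline_deg = implied_baseline(6.353, 0.041)
absolute_gain_deg = baseline_deg - 6.353

print(f"implied prior error: {baseline_deg:.3f} degrees")
print(f"absolute improvement: {absolute_gain_deg:.3f} degrees")
```

Under that reading, the improvement amounts to roughly a quarter of a degree in absolute terms.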

High-fidelity gaze redirection is critical for generating augmented data to improve the generalization of gaze estimators. 3D Gaussian Splatting (3DGS) models like GazeGaussian represent the state of the art but can struggle with rendering subtle, continuous gaze shifts. In this paper, we propose DiT-Gaze, a framework that enhances 3D gaze redirection models using a novel combination of a Diffusion Transformer (DiT), weak supervision across gaze angles, and an orthogonality constraint loss. DiT enables higher-fidelity image synthesis, while our weak supervision strategy, which uses synthetically generated intermediate gaze angles, provides a smooth manifold of gaze directions during training. The orthogonality constraint loss mathematically enforces the disentanglement of internal representations for gaze, head pose, and expression. Comprehensive experiments show that DiT-Gaze sets a new state of the art in both perceptual quality and redirection accuracy, reducing the state-of-the-art gaze error by 4.1%, to 6.353 degrees, and providing a superior method for creating synthetic training data. Our code and models will be made available for the research community to benchmark against.
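The abstract does not give the form of the orthogonality constraint loss. One common way to express such a constraint is to penalize the squared cosine similarity between each pair of feature vectors, which is zero only when they are mutually orthogonal. The sketch below illustrates that generic formulation; the function names, the pairwise squared-cosine form, and the assumption that the gaze, head pose, and expression embeddings share one dimensionality are all illustrative assumptions, not the authors' implementation.

```python
import math
from typing import Sequence


def _cosine(u: Sequence[float], v: Sequence[float], eps: float = 1e-12) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + eps)


def orthogonality_penalty(
    gaze: Sequence[float], head: Sequence[float], expr: Sequence[float]
) -> float:
    """Mean squared cosine similarity over the three embedding pairs.

    Hypothetical formulation: 0.0 when all pairs are orthogonal,
    approaching 1.0 when all three embeddings are collinear.
    """
    pairs = [(gaze, head), (gaze, expr), (head, expr)]
    return sum(_cosine(u, v) ** 2 for u, v in pairs) / len(pairs)


# Fully entangled: all three embeddings point the same way.
entangled = orthogonality_penalty([1.0, 0.0, 0.0], [1.0, 0.0, 0.0], [1.0, 0.0, 0.0])
# Fully disentangled: the embeddings lie along different axes.
disentangled = orthogonality_penalty([1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0])

print(entangled, disentangled)
```

In a training setup, a term like this would typically be added to the reconstruction objective with a weighting coefficient, so that minimizing it pushes the three representations toward separate subspaces.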

