HC · May 29

Gaze Prediction as Time-Series Forecasting for Virtual Reality Applications: Quantifying Performance Variability and Extreme-Case Errors

arXiv:2509.07126 · 4.7 · h-index: 35
Predicted impact top 89% in HC · last 90 days · Originality: Synthesis-oriented
AI Analysis

For VR developers, the paper highlights the need to evaluate gaze prediction robustly, beyond median error, so that foveated rendering stays perceptually stable.

The paper evaluates recurrent, transformer-based, and classification-guided architectures for gaze prediction in VR, finding that low median errors do not guarantee low extreme-case errors, and advocating for P95-focused, subject-specific metrics.

Gaze prediction is essential for addressing motion-to-photon latency and ensuring seamless foveated rendering in Virtual Reality. The reliability of gaze forecasting is highly sensitive to individual differences and the eye movements being predicted. We evaluate recurrent, transformer-based, and classification-guided architectures to assess their generalization capabilities across oculomotor events. Using the GazeBase VR and Meta Quest Pro datasets, we analyzed the relationship between the median (P50) and high-percentile (P95) error profiles across subjects. The analysis reveals significant performance variability, showing that subjects with low P50 errors do not always exhibit the lowest extreme-case errors. Consequently, low median errors do not guarantee the robustness of the utilized solution. We discuss inference performance and address the class imbalance problem in short-term gaze prediction. These results identify a gap in standardized evaluation methods, necessitating a shift toward P95-focused, subject-specific metrics to develop reliable and perceptually stable gaze-contingent systems.
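The P50 versus P95 comparison described in the abstract can be illustrated with a minimal sketch. It assumes each subject's prediction errors are available as a list of angular errors in degrees. The subject IDs, the error values, and the helper name `per_subject_percentiles` are hypothetical and chosen only for illustration; the abstract does not specify how the authors compute percentiles, so the inclusive interpolation method used here is also an assumption.

```python
from statistics import quantiles

def per_subject_percentiles(errors_by_subject):
    """Map each subject ID to the (P50, P95) of that subject's error samples."""
    stats = {}
    for subject, errors in errors_by_subject.items():
        # quantiles(..., n=100) returns the 99 cut points P1..P99;
        # "inclusive" interpolates linearly between observed samples.
        cuts = quantiles(errors, n=100, method="inclusive")
        stats[subject] = (cuts[49], cuts[94])  # P50 and P95
    return stats

# Hypothetical angular errors in degrees (illustration only, not paper data).
errors_by_subject = {
    "s01": [0.5] * 18 + [9.0, 10.0],  # low median, heavy tail
    "s02": [1.0] * 18 + [1.5, 2.0],   # higher median, light tail
}

stats = per_subject_percentiles(errors_by_subject)

# Rank subjects from best to worst under each metric.
rank_by_p50 = sorted(stats, key=lambda s: stats[s][0])
rank_by_p95 = sorted(stats, key=lambda s: stats[s][1])

for subject, (p50, p95) in stats.items():
    print(f"{subject}: P50={p50:.3f} deg, P95={p95:.3f} deg")
print("best-to-worst by P50:", rank_by_p50)
print("best-to-worst by P95:", rank_by_p95)
```

In this toy data the subject with the lowest median error has the highest P95 error, so the two rankings disagree. That is the kind of mismatch the abstract reports when it says that low median errors do not guarantee low extreme-case errors.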

