CV · Dec 3, 2024

EgoCast: Forecasting Egocentric Human Pose in the Wild

arXiv:2412.02903v1 · 10 citations · h-index: 28 · WACV
Originality: Incremental advance
AI Analysis

The paper addresses the need for accurate human pose forecasting to enhance immersion in Augmented Reality, though the advance appears incremental because it builds on existing frameworks.

The paper tackles the problem of forecasting 3D human pose from egocentric videos and proprioceptive data in realistic settings. It introduces EgoCast, which eliminates the need for past groundtruth poses during inference, and it significantly outperforms state-of-the-art approaches on the Ego-Exo4D Body Pose 2024 Challenge.

Accurately estimating and forecasting human body pose is important for enhancing the user's sense of immersion in Augmented Reality. Addressing this need, our paper introduces EgoCast, a bimodal method for 3D human pose forecasting using egocentric videos and proprioceptive data. We study the task of human pose forecasting in a realistic setting, extending the boundaries of temporal forecasting in dynamic scenes and building on the current framework for current pose estimation in the wild. We introduce a current-frame estimation module that generates pseudo-groundtruth poses for inference, eliminating the need for past groundtruth poses typically required by current methods during forecasting. Our experimental results on the recent Ego-Exo4D and Aria Digital Twin datasets validate EgoCast for real-life motion estimation. On the Ego-Exo4D Body Pose 2024 Challenge, our method significantly outperforms the state-of-the-art approaches, laying the groundwork for future research in human pose estimation and forecasting in unscripted activities with egocentric inputs.
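
The two-stage idea in the abstract (estimate the current pose first, then forecast from those estimates rather than from past groundtruth) can be illustrated with a minimal sketch. Everything here is an assumption for illustration and not the authors' implementation: the function names, the joint count, the template-translation "estimator", and the constant-velocity "forecaster" are hypothetical stand-ins for the learned modules, and the egocentric video input is omitted so that only a headset position plays the role of the proprioceptive signal.

```python
import numpy as np

NUM_JOINTS = 17  # assumption: joint count chosen only for illustration


def estimate_current_pose(headset_position, template):
    # Hypothetical stand-in for a learned current-frame estimator:
    # translate a fixed body template to the headset position.
    return template + headset_position  # shape (NUM_JOINTS, 3)


def forecast_poses(past_poses, horizon):
    # Hypothetical stand-in for a learned forecaster: constant-velocity
    # extrapolation from the last two poses in the history.
    velocity = past_poses[-1] - past_poses[-2]
    steps = np.arange(1, horizon + 1).reshape(-1, 1, 1)
    return past_poses[-1] + steps * velocity  # shape (horizon, NUM_JOINTS, 3)


def forecast_without_groundtruth(headset_positions, template, horizon):
    # The pose history is built from estimates only (pseudo-groundtruth);
    # no groundtruth poses are consumed at inference time.
    pseudo_gt = np.stack(
        [estimate_current_pose(p, template) for p in headset_positions]
    )
    return forecast_poses(pseudo_gt, horizon)


rng = np.random.default_rng(0)
template = rng.normal(size=(NUM_JOINTS, 3))
headset_positions = np.array(
    [[0.0, 0.0, 1.6], [0.1, 0.0, 1.6], [0.2, 0.0, 1.6]]
)
future = forecast_without_groundtruth(headset_positions, template, horizon=4)
print(future.shape)
```

The point of the sketch is the data flow: the forecaster's history comes entirely from the estimator's output, so the pipeline can run on a device where no past groundtruth poses are available.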

