CV · Aug 20, 2025

LookOut: Real-World Humanoid Egocentric Navigation

arXiv:2508.14466v1 · 8 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses a challenging real-world navigation problem for applications like humanoid robotics and VR/AR, though it is incremental in combining existing techniques with new data.

The paper tackles the problem of predicting future 6D head poses from egocentric video to enable collision-free navigation, introducing a dataset and model that learns human-like behaviors such as waiting and rerouting, with experiments showing generalization to unseen environments.

The ability to predict collision-free future trajectories from egocentric observations is crucial in applications such as humanoid robotics, VR / AR, and assistive navigation. In this work, we introduce the challenging problem of predicting a sequence of future 6D head poses from an egocentric video. In particular, we predict both head translations and rotations to learn the active information-gathering behavior expressed through head-turning events. To solve this task, we propose a framework that reasons over temporally aggregated 3D latent features, which models the geometric and semantic constraints for both the static and dynamic parts of the environment. Motivated by the lack of training data in this space, we further contribute a data collection pipeline using the Project Aria glasses, and present a dataset collected through this approach. Our dataset, dubbed Aria Navigation Dataset (AND), consists of 4 hours of recording of users navigating in real-world scenarios. It includes diverse situations and navigation behaviors, providing a valuable resource for learning real-world egocentric navigation policies. Extensive experiments show that our model learns human-like navigation behaviors such as waiting / slowing down, rerouting, and looking around for traffic while generalizing to unseen environments. Check out our project webpage at https://sites.google.com/stanford.edu/lookout.
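To make the task's output concrete, here is a minimal sketch of a 6D head pose (3D translation plus 3D rotation) and of how a predicted pose sequence might be compared with a recorded one. Everything in it is an illustrative assumption rather than the paper's method: the `HeadPose` class, the `yaw_pose` helper, the quaternion storage, the choice of z as the vertical axis, metres as the unit, and the two error measures are all hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class HeadPose:
    """One 6D head pose: 3D translation plus 3D rotation (hypothetical layout)."""
    t: tuple  # (x, y, z) translation; metres are an assumption
    q: tuple  # (w, x, y, z) unit quaternion encoding the rotation


def yaw_pose(x, y, z, yaw_rad):
    """Hypothetical helper: a pose rotated only about the z axis (assumed vertical)."""
    half = yaw_rad / 2.0
    return HeadPose((x, y, z), (math.cos(half), 0.0, 0.0, math.sin(half)))


def translation_error(a, b):
    """Euclidean distance between the two head positions."""
    return math.dist(a.t, b.t)


def rotation_error(a, b):
    """Geodesic angle in radians between the two head orientations."""
    dot = abs(sum(p * r for p, r in zip(a.q, b.q)))
    return 2.0 * math.acos(min(1.0, dot))  # clamp guards against rounding above 1


def mean_errors(pred, truth):
    """Average translation and rotation error over two equal-length pose sequences."""
    n = len(pred)
    te = sum(translation_error(p, g) for p, g in zip(pred, truth)) / n
    re = sum(rotation_error(p, g) for p, g in zip(pred, truth)) / n
    return te, re


# Recorded walk: one step forward while looking straight ahead.
truth = [yaw_pose(0, 0, 0, 0.0), yaw_pose(1, 0, 0, 0.0)]
# Prediction: same path, but with a 90-degree head turn at the second step.
pred = [yaw_pose(0, 0, 0, 0.0), yaw_pose(1, 0, 0, math.pi / 2)]
te, re = mean_errors(pred, truth)
```

Scoring rotation separately from translation reflects the abstract's point that head turns carry information-gathering behavior: in this example the predicted path matches the recorded one exactly, yet the rotation error is nonzero because of the head turn.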

