CV · CL · May 7

MobileEgo Anywhere: Open Infrastructure for long horizon egocentric data on commodity hardware

arXiv:2605.05945 · 21.5 · Has Code
AI Analysis

This work lowers the hardware barrier for collecting long-horizon egocentric data, enabling broader participation in robotic data collection for the VLA research community.

MobileEgo Anywhere addresses the lack of long-horizon egocentric datasets for VLA models by providing a framework using commodity smartphones to collect hour-plus trajectories, releasing a 200-hour dataset, an open-source mobile app, and a processing pipeline.

The recent advancement of Vision-Language-Action (VLA) models has driven a critical demand for large-scale egocentric datasets. However, existing datasets are often limited by short episode durations, typically spanning only a few minutes, and so fail to capture the long-horizon temporal dependencies necessary for complex robotic task execution. To bridge this gap, we present MobileEgo Anywhere, a framework designed to facilitate the collection of robust, hour-plus egocentric trajectories using commodity mobile hardware. We leverage the ubiquitous sensor suites of modern smartphones to provide high-fidelity, long-term camera pose tracking, effectively removing the high hardware barriers associated with traditional robotics data collection. Our contributions are threefold: (1) we release a novel dataset comprising 200 hours of diverse, long-form egocentric data with persistent state tracking; (2) we open-source a mobile application that enables any user to record egocentric data; and (3) we provide a comprehensive processing pipeline to convert raw mobile captures into standardized, training-ready formats for Vision-Language-Action model and foundation model research. By democratizing the data collection process, this work enables the massive-scale acquisition of long-horizon data across varied global environments, accelerating the development of generalizable robotic policies.
