RO AI SY
Oct 13, 2025

Ego-Vision World Model for Humanoid Contact Planning

arXiv:2510.11682v1
4 citations
h-index: 54
Originality: Incremental advance
AI Analysis

This work addresses the challenge of contact-aware planning for humanoid robots in unstructured settings, representing an incremental advance over existing methods.

The paper aims to let humanoid robots use physical contact, rather than merely avoid it, to act autonomously in unstructured environments. It proposes a framework that combines a learned world model with sampling-based Model Predictive Control (MPC), and reports robust, real-time contact planning on a physical humanoid with better data efficiency and multi-task capability than on-policy RL.

Enabling humanoid robots to exploit physical contact, rather than simply avoid collisions, is crucial for autonomy in unstructured environments. Traditional optimization-based planners struggle with contact complexity, while on-policy reinforcement learning (RL) is sample-inefficient and has limited multi-task ability. We propose a framework combining a learned world model with sampling-based Model Predictive Control (MPC), trained on a demonstration-free offline dataset to predict future outcomes in a compressed latent space. To address sparse contact rewards and sensor noise, the MPC uses a learned surrogate value function for dense, robust planning. Our single, scalable model supports contact-aware tasks, including wall support after perturbation, blocking incoming objects, and traversing height-limited arches, with improved data efficiency and multi-task capability over on-policy RL. Deployed on a physical humanoid, our system achieves robust, real-time contact planning from proprioception and ego-centric depth images. Website: https://ego-vcp.github.io/
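The planning loop described in the abstract can be illustrated with a minimal sketch, under several assumptions. The abstract does not say which sampling scheme the MPC uses, so plain random shooting is swapped in here. `plan_action`, `toy_dynamics`, and `toy_value` are hypothetical names: the two toy functions are hand-written stand-ins for the learned latent world model and the learned surrogate value function, and the horizon, sample count, and action bounds are arbitrary illustrative values, not the authors' settings.

```python
import numpy as np


def plan_action(z0, dynamics, value_fn, horizon=5, num_samples=256,
                action_dim=2, rng=None):
    """Random-shooting MPC in latent space (illustrative, not the paper's planner).

    Samples action sequences, rolls each one out through `dynamics`,
    scores it by the summed surrogate value of the predicted latents,
    and returns the first action of the best sequence with its score.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Candidate action sequences, uniform in [-1, 1] (assumed action bounds).
    actions = rng.uniform(-1.0, 1.0, size=(num_samples, horizon, action_dim))
    # Every candidate starts from the same current latent state.
    z = np.repeat(z0[None, :], num_samples, axis=0)
    scores = np.zeros(num_samples)
    for t in range(horizon):
        z = dynamics(z, actions[:, t])  # batched one-step latent prediction
        scores += value_fn(z)           # dense score from the surrogate value
    best = int(np.argmax(scores))
    return actions[best, 0], scores[best]


# Hypothetical stand-ins for the learned components.
def toy_dynamics(z, a):
    # Assumed toy latent dynamics: the action nudges the latent state.
    return z + 0.1 * a


def toy_value(z):
    # Assumed toy value: higher (closer to zero) near the latent origin.
    return -np.sum(z ** 2, axis=-1)


if __name__ == "__main__":
    action, score = plan_action(np.array([1.0, 1.0]), toy_dynamics, toy_value,
                                rng=np.random.default_rng(0))
    print("first action:", action, "score:", score)
```

In a receding-horizon setup, only the returned first action is executed; the planner is then called again from the next latent state. Scoring each predicted latent with a value function, rather than waiting for a sparse contact reward, is what makes the per-sequence scores dense in this sketch.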

