CV · Apr 17

Active World-Model with 4D-informed Retrieval for Exploration and Awareness

arXiv:2604.16733 · 33.9 · h-index: 6
AI Analysis

This work addresses the challenge of physical awareness in large dynamic environments for reinforcement learning agents, offering a surrogate environment to reduce costly real-world exploration.

AW4RE introduces a generative world model that uses 4D-informed retrieval to predict observations conditioned on sensing actions, enabling efficient exploration in partially observable environments. It outperforms geometry-aware baselines in prediction consistency under extreme viewpoint shifts and sparse data.

Physical awareness, especially in a large and dynamic environment, is shaped by sensing decisions that determine observability across space, time, and scale, while observations impact the quality of sensing decisions. This loopy information structure makes physical awareness a fundamentally challenging decision problem with partial observations. While in the past decade we have witnessed the unprecedented success of reinforcement learning (RL) in problems with full observability, decision problems with partial observations, such as POMDPs, remain largely open: real-world explorations are excessively costly, while sim-to-real pipelines suffer from unobserved viewpoints. We introduce AW4RE (Active World-model with 4D-informed Retrieval for Exploration), an awareness-centric generative world model that provides a sensor-native surrogate environment for exploring sensing queries. Conditioned on a queried sensing action, AW4RE estimates the action-conditioned observation process. This is done by combining 4D-informed evidence retrieval, action-conditioned geometric support with temporal coherence, and conditional generative completion. Experiments demonstrate that AW4RE produces more grounded and consistent predictions than geometry-aware generative baselines under extreme viewpoint shifts, temporal gaps, and sparse geometric support.
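
To make the idea of retrieving evidence across both space and time concrete, here is a minimal sketch of a space-time nearest-neighbour lookup. It is not the AW4RE retrieval method, which the abstract does not specify. The `Evidence` schema, the `time_weight` parameter, and the weighted Euclidean distance are all assumptions introduced purely for illustration.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """One stored observation (hypothetical schema, not from the paper)."""
    name: str
    position: tuple[float, float, float]  # sensor position when captured
    time: float                           # capture time


def spacetime_distance(query_position, query_time, evidence, time_weight=1.0):
    """Distance in (x, y, z, t); time_weight is an assumed trade-off knob."""
    spatial = math.dist(query_position, evidence.position)
    temporal = abs(query_time - evidence.time)
    return math.hypot(spatial, time_weight * temporal)


def retrieve(query_position, query_time, memory, k=2, time_weight=1.0):
    """Return the k stored observations closest to the queried pose and time."""
    ranked = sorted(
        memory,
        key=lambda e: spacetime_distance(query_position, query_time, e, time_weight),
    )
    return ranked[:k]


memory = [
    Evidence("a", (0.0, 0.0, 0.0), 0.0),
    Evidence("b", (1.0, 0.0, 0.0), 5.0),
    Evidence("c", (10.0, 0.0, 0.0), 0.0),
    Evidence("d", (0.0, 1.0, 0.0), 1.0),
]

# Query at the origin at time 0: spatially close but stale evidence ("b") loses to "d".
nearest = retrieve((0.0, 0.0, 0.0), 0.0, memory, k=2)
print([e.name for e in nearest])
```

In this toy setup, raising `time_weight` favours recent evidence over spatially close evidence, which is one simple way to think about the temporal gaps the abstract mentions.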
