CV, RO | Nov 18, 2024

Generative World Explorer

arXiv:2411.11844v3 | 13 citations | h-index: 47
Originality: Incremental advance
AI Analysis

This paper addresses the problem of reducing how much physical exploration embodied AI agents need. The approach is novel, but its impact is incremental because it builds on existing decision-making models.

The paper tackles planning with partial observation in embodied AI by introducing Generative World Explorer (Genex), a framework that enables agents to mentally explore 3D worlds and update beliefs with imagined observations, resulting in improved decision-making for an LLM agent.

Planning with partial observation is a central challenge in embodied AI. A majority of prior works have tackled this challenge by developing agents that physically explore their environment to update their beliefs about the world state. In contrast, humans can $\textit{imagine}$ unseen parts of the world through mental exploration and $\textit{revise}$ their beliefs with imagined observations. Such updated beliefs can allow them to make more informed decisions, without necessitating the physical exploration of the world at all times. To achieve this human-like ability, we introduce the $\textit{Generative World Explorer (Genex)}$, an egocentric world exploration framework that allows an agent to mentally explore a large-scale 3D world (e.g., urban scenes) and acquire imagined observations to update its belief. This updated belief will then help the agent to make a more informed decision at the current step. To train $\textit{Genex}$, we create a synthetic urban scene dataset, Genex-DB. Our experimental results demonstrate that (1) $\textit{Genex}$ can generate high-quality and consistent observations during long-horizon exploration of a large virtual physical world and (2) the beliefs updated with the generated observations can inform an existing decision-making model (e.g., an LLM agent) to make better plans.
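
The loop described in the abstract (imagine observations, revise the belief, then decide) can be illustrated with a toy sketch. This is not the paper's implementation. The abstract does not specify how beliefs are updated, so a plain Bayesian update is used here as a stand-in, and every name, state, and probability below (imagine_observations, likelihood, the "emergency"/"normal" states) is a hypothetical placeholder. In particular, imagine_observations stands in for the generative model and simply returns canned text.

```python
from typing import Callable, Dict, List

Belief = Dict[str, float]  # probability over hypothetical world states


def normalize(belief: Belief) -> Belief:
    """Rescale the values so they sum to 1."""
    total = sum(belief.values())
    return {state: p / total for state, p in belief.items()}


def update_belief(belief: Belief, observation: str,
                  likelihood: Callable[[str, str], float]) -> Belief:
    """Bayesian update (an assumption, not the paper's stated method)."""
    posterior = {s: p * likelihood(observation, s) for s, p in belief.items()}
    return normalize(posterior)


def imagine_observations(path: List[str]) -> List[str]:
    """Hypothetical stand-in for a generative explorer: canned outputs only."""
    canned = {"forward": "clear road", "turn_left": "ambulance approaching"}
    return [canned.get(step, "nothing notable") for step in path]


def likelihood(observation: str, state: str) -> float:
    """Made-up P(observation | state); uninformative (0.5) when unknown."""
    table = {
        ("ambulance approaching", "emergency"): 0.9,
        ("ambulance approaching", "normal"): 0.1,
    }
    return table.get((observation, state), 0.5)


def choose_action(belief: Belief) -> str:
    """Toy decision rule standing in for the downstream decision-making model."""
    return "stop" if belief["emergency"] > 0.5 else "go"


# Prior belief before any (mental) exploration.
belief: Belief = {"emergency": 0.2, "normal": 0.8}

# Mentally explore a path and fold each imagined observation into the belief.
for obs in imagine_observations(["forward", "turn_left"]):
    belief = update_belief(belief, obs, likelihood)

action = choose_action(belief)
```

With these made-up numbers, the imagined "ambulance approaching" observation shifts the belief toward "emergency", so the toy rule chooses "stop" without the agent physically moving. In the framework described above, the decision step would be an existing decision-making model such as an LLM agent rather than a fixed threshold.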

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
