CV | RO | Feb 29, 2024

DOZE: A Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments

arXiv:2402.19007v2 | 8 citations | h-index: 11 | IEEE Robot Autom Lett
AI Analysis

This work addresses the challenge of developing more realistic and robust embodied AI agents for navigation tasks. The contribution is incremental, since it consists of a dataset rather than a new algorithm.

The authors tackle zero-shot object navigation in dynamic environments by creating a new dataset, DOZE, with moving obstacles and diverse objects. Their experiments show that existing methods have significant room for improvement in navigation efficiency, safety, and object recognition accuracy.

Zero-Shot Object Navigation (ZSON) requires agents to autonomously locate and approach unseen objects in unfamiliar environments and has emerged as a particularly challenging task within the domain of Embodied AI. Existing datasets for developing ZSON algorithms do not account for dynamic obstacles, object attribute diversity, or scene text, and therefore differ noticeably from real-world situations. To address these issues, we propose a Dataset for Open-Vocabulary Zero-Shot Object Navigation in Dynamic Environments (DOZE) that comprises ten high-fidelity 3D scenes with over 18k tasks, aiming to mimic complex, dynamic real-world scenarios. Specifically, DOZE scenes feature multiple moving humanoid obstacles, a wide array of open-vocabulary objects, diverse distinct-attribute objects, and valuable textual hints. In addition, unlike existing datasets, which only provide collision checking between the agent and static obstacles, DOZE integrates capabilities for detecting collisions between the agent and moving obstacles. This novel functionality enables the evaluation of agents' collision avoidance abilities in dynamic environments. We test four representative ZSON methods on DOZE, revealing substantial room for improvement in existing approaches concerning navigation efficiency, safety, and object recognition accuracy. Our dataset can be found at https://DOZE-Dataset.github.io/.
