RO · AI · CV · HC · LG · May 9, 2024

RoboHop: Segment-based Topological Map Representation for Open-World Visual Navigation

arXiv:2405.05792v1 · 46 citations · ICRA
Originality: Highly original
AI Analysis

This work addresses the challenge of robust and interpretable mapping for robots in open-world environments, offering a more flexible alternative to metric or purely topological approaches.

The paper tackles the problem of open-world visual navigation by proposing a novel topological map representation using semantically meaningful image segments as nodes, which enables navigation planning via segment hops and object search with natural language queries. It demonstrates improved robot localization through segment-level retrieval and shows preliminary zero-shot navigation trials in real-world data.

Mapping is crucial for spatial reasoning, planning and robot navigation. Existing approaches range from metric, which require precise geometry-based optimization, to purely topological, where image-as-node based graphs lack explicit object-level reasoning and interconnectivity. In this paper, we propose a novel topological representation of an environment based on "image segments", which are semantically meaningful and open-vocabulary queryable, conferring several advantages over previous works based on pixel-level features. Unlike 3D scene graphs, we create a purely topological graph with segments as nodes, where edges are formed by a) associating segment-level descriptors between pairs of consecutive images and b) connecting neighboring segments within an image using their pixel centroids. This unveils a "continuous sense of a place", defined by inter-image persistence of segments along with their intra-image neighbours. It further enables us to represent and update segment-level descriptors through neighborhood aggregation using graph convolution layers, which improves robot localization based on segment-level retrieval. Using real-world data, we show how our proposed map representation can be used to i) generate navigation plans in the form of "hops over segments" and ii) search for target objects using natural language queries describing spatial relations of objects. Furthermore, we quantitatively analyze data association at the segment level, which underpins inter-image connectivity during mapping and segment-level localization when revisiting the same place. Finally, we show preliminary trials on zero-shot real-world navigation based on segment-level "hopping". Project page with supplementary details: oravus.github.io/RoboHop/
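The graph construction described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function names (`build_segment_graph`, `plan_hops`), the centroid distance threshold `radius`, the cosine similarity threshold `min_sim`, and the hand-made three-dimensional descriptors are all assumptions chosen for illustration. Segments are nodes keyed by `(image_index, segment_index)`, intra-image edges link segments with nearby pixel centroids, inter-image edges link each segment to its best descriptor match in the next image, and a plan is a shortest sequence of hops found by breadth-first search.

```python
from collections import deque
from math import dist, sqrt


def cosine(a, b):
    """Cosine similarity between two descriptor vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def build_segment_graph(images, radius=60.0, min_sim=0.9):
    """Build an undirected segment graph.

    images: list of images, each a list of (centroid, descriptor) pairs.
    radius and min_sim are illustrative thresholds, not values from the paper.
    """
    graph = {(i, s): set() for i, segs in enumerate(images) for s in range(len(segs))}

    def link(u, v):
        graph[u].add(v)
        graph[v].add(u)

    for i, segs in enumerate(images):
        # Intra-image edges: segments whose pixel centroids are close together.
        for a in range(len(segs)):
            for b in range(a + 1, len(segs)):
                if dist(segs[a][0], segs[b][0]) <= radius:
                    link((i, a), (i, b))
        # Inter-image edges: best descriptor match in the next image.
        if i + 1 < len(images) and images[i + 1]:
            nxt = images[i + 1]
            for a, (_, desc) in enumerate(segs):
                best = max(range(len(nxt)), key=lambda b: cosine(desc, nxt[b][1]))
                if cosine(desc, nxt[best][1]) >= min_sim:
                    link((i, a), (i + 1, best))
    return graph


def plan_hops(graph, start, goal):
    """Shortest sequence of segment hops from start to goal (BFS), or None."""
    parent = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbour in sorted(graph[node]):
            if neighbour not in parent:
                parent[neighbour] = node
                queue.append(neighbour)
    return None


# Toy data: three consecutive images with made-up centroids and descriptors.
images = [
    [((10, 10), [1, 0, 0]), ((50, 10), [0, 1, 0])],  # image 0: "door", "wall"
    [((12, 10), [0, 1, 0]), ((55, 12), [0, 0, 1])],  # image 1: "wall", "chair"
    [((20, 15), [0, 0, 1])],                         # image 2: "chair"
]
graph = build_segment_graph(images)
plan = plan_hops(graph, start=(0, 0), goal=(2, 0))
print(plan)
```

In this toy example the plan moves from the "door" segment to the neighbouring "wall" segment in the same image, follows the wall into the next image, hops to the adjacent "chair" segment, and follows the chair into the final image. A real system would obtain segments and descriptors from learned models and would likely use a more robust neighbourhood rule than a fixed radius.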

