CV | Mar 18, 2024

GaussNav: Gaussian Splatting for Visual Navigation

arXiv:2403.11625v3 | 62 citations | h-index: 67 | Has Code | IEEE Trans Pattern Anal Mach Intell
Originality: Highly original
AI Analysis

This paper addresses the challenge of recognizing specific objects across viewpoints in embodied vision, offering a significant improvement over existing methods for instance-level navigation tasks.

The paper tackles the problem of Instance ImageGoal Navigation by proposing GaussNav, a framework using 3D Gaussian Splatting to create detailed scene maps, resulting in an SPL increase from 0.347 to 0.578 on the HM3D dataset.

In embodied vision, Instance ImageGoal Navigation (IIN) requires an agent to locate a specific object depicted in a goal image within an unexplored environment. The primary challenge of IIN arises from the need to recognize the target object across varying viewpoints while ignoring potential distractors. Existing map-based navigation methods typically use Bird's Eye View (BEV) maps, which lack detailed texture representation of a scene. Consequently, while BEV maps are effective for semantic-level visual navigation, they struggle with instance-level tasks. To this end, we propose a new framework for IIN, Gaussian Splatting for Visual Navigation (GaussNav), which constructs a novel map representation based on 3D Gaussian Splatting (3DGS). The GaussNav framework enables the agent to memorize both the geometry and semantic information of the scene, as well as retain the textural features of objects. By matching renderings of similar objects with the target, the agent can accurately identify, ground, and navigate to the specified object. Our GaussNav framework demonstrates a significant performance improvement, with Success weighted by Path Length (SPL) increasing from 0.347 to 0.578 on the challenging Habitat-Matterport 3D (HM3D) dataset. The source code is publicly available at https://github.com/XiaohanLei/GaussNav.
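
For readers unfamiliar with the SPL metric quoted above, the following is a minimal sketch of how Success weighted by Path Length is commonly computed in embodied navigation benchmarks: each episode contributes its success flag scaled by the ratio of the shortest-path length to the longer of the shortest path and the path actually taken, and the result is averaged over episodes. The function name `spl` and its argument names are illustrative assumptions, not taken from the GaussNav code base, and the paper's own evaluation code may differ in detail.

```python
from typing import Sequence


def spl(successes: Sequence[int],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over episodes.

    successes[i]        -- 1 if episode i reached the goal, else 0
    shortest_lengths[i] -- shortest-path length from start to goal (must be > 0)
    path_lengths[i]     -- length of the path the agent actually travelled
    """
    n = len(successes)
    if n == 0:
        raise ValueError("at least one episode is required")
    if not (n == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same length")

    total = 0.0
    for s, l, p in zip(successes, shortest_lengths, path_lengths):
        if l <= 0:
            raise ValueError("shortest path length must be positive")
        # A successful episode scores l / max(p, l): 1.0 for an optimal path,
        # less for a detour. A failed episode scores 0.
        total += s * l / max(p, l)
    return total / n


if __name__ == "__main__":
    # Three hypothetical episodes: a detour, a failure, and an optimal path.
    print(spl([1, 0, 1], [10.0, 5.0, 4.0], [20.0, 7.0, 4.0]))
```

Under this definition a score of 0.578 means that, averaged over episodes, the agent both succeeds often and takes paths reasonably close to the shortest one; a method that always succeeds but takes paths twice the optimal length would score 0.5.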

Code Implementations: 1 repo
