RO · AI · Oct 27, 2025

HyPerNav: Hybrid Perception for Object-Oriented Navigation in Unknown Environment

arXiv:2510.22917v2 · 2 citations · h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses autonomous robot navigation. Its contribution is incremental: it combines existing perceptual modalities rather than introducing a new one.

The paper tackles object-oriented navigation in unknown environments by integrating egocentric observations and top-down maps using Vision-Language Models, achieving state-of-the-art performance in simulations and real-world tests.

Object-oriented navigation (ObjNav) enables a robot to navigate directly and autonomously to a target object in an unknown environment. Effective perception is critical for autonomous robots navigating such environments. While egocentric observations from RGB-D sensors provide abundant local information, real-time top-down maps offer valuable global context for ObjNav. Nevertheless, most existing studies focus on a single source and seldom integrate these two complementary perceptual modalities, even though humans naturally attend to both. Building on the rapid advancement of Vision-Language Models (VLMs), we propose Hybrid Perception Navigation (HyPerNav), which leverages VLMs' strong reasoning and vision-language understanding capabilities to jointly perceive local and global information, enhancing the effectiveness and intelligence of navigation in unknown environments. In both large-scale simulation evaluation and real-world validation, our method achieved state-of-the-art performance against popular baselines. By simultaneously leveraging information from egocentric observations and the top-down map, the hybrid perception approach captures richer cues and finds objects more effectively. Our ablation study further showed that each of the two perception sources contributes to navigation performance.
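
As a rough illustration of the hybrid-perception idea described in the abstract, the sketch below passes both an egocentric frame and a top-down map to a VLM in a single query and asks it to pick the next move. It is not the authors' implementation: the `Observation` fields, the prompt wording, the candidate labels, and the `vlm` callable are all hypothetical stand-ins, and a real system would plug in an actual model client and image encoding.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

# Hypothetical signature for a VLM client: (prompt, images) -> text reply.
VLM = Callable[[str, Sequence[bytes]], str]


@dataclass
class Observation:
    """One perception step (all fields are illustrative assumptions)."""
    egocentric_rgb: bytes        # encoded first-person camera frame (local view)
    top_down_map: bytes          # encoded rendering of the map built so far (global view)
    candidates: Sequence[str]    # labels of candidate moves, e.g. "A", "B", "C"


def build_prompt(target: str, candidates: Sequence[str]) -> str:
    """Compose one prompt that refers to both images at once."""
    options = ", ".join(candidates)
    return (
        f"You are guiding a robot to find: {target}.\n"
        "Image 1 is the robot's egocentric view. Image 2 is a top-down map.\n"
        f"Using both images, reply with exactly one of: {options}."
    )


def choose_action(obs: Observation, target: str, vlm: VLM) -> str:
    """Query the VLM with both views and return a valid candidate label."""
    if not obs.candidates:
        raise ValueError("at least one candidate action is required")
    prompt = build_prompt(target, obs.candidates)
    reply = vlm(prompt, [obs.egocentric_rgb, obs.top_down_map]).strip()
    # Fall back to the first candidate if the reply is not a known label.
    return reply if reply in obs.candidates else obs.candidates[0]


def fake_vlm(prompt: str, images: Sequence[bytes]) -> str:
    """Stand-in model used only to exercise the code path."""
    return "B"


obs = Observation(b"rgb-bytes", b"map-bytes", ["A", "B", "C"])
chosen = choose_action(obs, "chair", fake_vlm)
```

Sending both views in one query, rather than scoring each separately and merging the scores afterwards, is one simple way to let the model reason over local and global cues jointly. Whether HyPerNav structures its queries this way is not stated in the abstract.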

