RO · CV · Jul 9, 2025

LOVON: Legged Open-Vocabulary Object Navigator

arXiv:2507.06747v1 · 7 citations · h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses long-horizon object navigation for robotic systems in dynamic, unstructured environments. It is an incremental advance that integrates existing components.

The paper proposes LOVON, a framework for object navigation in open-world environments that integrates large language models for hierarchical task planning with open-vocabulary visual detection models. It reports successful completion of long-sequence tasks involving real-time detection, search, and navigation toward dynamic targets across different legged robots.

Object navigation in open-world environments remains a formidable and pervasive challenge for robotic systems, particularly when it comes to executing long-horizon tasks that require both open-world object detection and high-level task planning. Traditional methods often struggle to integrate these components effectively, and this limits their capability to deal with complex, long-range navigation missions. In this paper, we propose LOVON, a novel framework that integrates large language models (LLMs) for hierarchical task planning with open-vocabulary visual detection models, tailored for effective long-range object navigation in dynamic, unstructured environments. To tackle real-world challenges including visual jittering, blind zones, and temporary target loss, we design dedicated solutions such as Laplacian Variance Filtering for visual stabilization. We also develop a functional execution logic for the robot that guarantees LOVON's capabilities in autonomous navigation, task adaptation, and robust task completion. Extensive evaluations demonstrate the successful completion of long-sequence tasks involving real-time detection, search, and navigation toward open-vocabulary dynamic targets. Furthermore, real-world experiments across different legged robots (Unitree Go2, B2, and H1-2) showcase the compatibility and appealing plug-and-play feature of LOVON.
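The abstract names Laplacian Variance Filtering for visual stabilization but does not spell out how it works. A widely used sharpness measure is the variance of an image's discrete Laplacian: motion-blurred frames have weak edges and therefore a low variance. The sketch below illustrates that general idea under stated assumptions and is not LOVON's implementation. The function names, the threshold value, and the policy of reusing the last sharp frame in place of a blurry one are all hypothetical choices made for illustration.

```python
import numpy as np


def laplacian_variance(gray: np.ndarray) -> float:
    """Return the variance of the 4-neighbour discrete Laplacian.

    Only interior pixels are used, so no border padding is needed.
    Higher values indicate a sharper image.
    """
    g = gray.astype(np.float64)
    lap = (
        g[:-2, 1:-1]      # pixel above
        + g[2:, 1:-1]     # pixel below
        + g[1:-1, :-2]    # pixel to the left
        + g[1:-1, 2:]     # pixel to the right
        - 4.0 * g[1:-1, 1:-1]
    )
    return float(lap.var())


def filter_frames(frames, threshold):
    """Replace blurry frames with the most recent sharp frame.

    Returns a list of (frame, was_sharp) pairs. Blurry frames seen
    before any sharp frame are dropped. `threshold` is an assumed,
    camera-dependent tuning parameter.
    """
    last_sharp = None
    out = []
    for frame in frames:
        if laplacian_variance(frame) >= threshold:
            last_sharp = frame
            out.append((frame, True))
        elif last_sharp is not None:
            out.append((last_sharp, False))
    return out


if __name__ == "__main__":
    # Synthetic frames: a checkerboard (strong edges) and a flat grey image.
    sharp = (np.indices((8, 8)).sum(axis=0) % 2) * 255
    flat = np.full((8, 8), 128)
    for name, img in [("sharp", sharp), ("flat", flat)]:
        print(name, laplacian_variance(img))
```

In practice the threshold would need tuning per camera and scene, and a downstream detector would consume only the frames marked sharp (or the substituted ones) so that jitter from a walking robot does not produce spurious detections.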

