RO AI Jan 8

SeqWalker: Sequential-Horizon Vision-and-Language Navigation with Hierarchical Planning

Zebin Han, Xudong Wang, Baichen Liu, Qi Lyu, Zhenduo Shang, Jiahua Dong, Lianqing Liu, Zhi Han

arXiv:2601.04699v17.02 citations h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses a challenge in robotics and AI for navigation agents, but the advance is incremental because it builds on existing vision-and-language navigation methods.

The paper tackles the problem of sequential-horizon vision-and-language navigation, where agents struggle with multi-task instructions due to information overload, and proposes SeqWalker, a hierarchical planning model that achieves superior performance on a new benchmark.

Sequential-Horizon Vision-and-Language Navigation (SH-VLN) presents a challenging scenario where agents should sequentially execute multi-task navigation guided by complex, long-horizon language instructions. Current vision-and-language navigation models exhibit significant performance degradation with such multi-task instructions, as information overload impairs the agent's ability to attend to observationally relevant details. To address this problem, we propose SeqWalker, a navigation model built on a hierarchical planning framework. Our SeqWalker features: i) a High-Level Planner that dynamically selects, from the global instruction, the contextually relevant sub-instructions based on the agent's current visual observations, thus reducing cognitive load; ii) a Low-Level Planner incorporating an Exploration-Verification strategy that leverages the inherent logical structure of instructions for trajectory error correction. To evaluate SH-VLN performance, we also extend the IVLN dataset and establish a new benchmark. Extensive experiments demonstrate the superiority of the proposed SeqWalker.
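
To make the two-level idea in the abstract concrete, the toy sketch below splits a long instruction into sub-instructions, picks the one that best matches what the agent currently sees, and checks whether a step's landmark has actually been observed. It is not the authors' implementation: the function names, the sentence-based splitting, and the word-overlap scoring are all assumptions made for illustration, standing in for whatever learned components SeqWalker uses.

```python
import re


def split_instruction(instruction):
    """Split a long instruction into sub-instructions at '.' or ';' (assumed heuristic)."""
    parts = re.split(r"[.;]\s*", instruction)
    return [p.strip() for p in parts if p.strip()]


def words(text):
    """Lowercased set of alphabetic words in a string."""
    return set(re.findall(r"[a-z]+", text.lower()))


def select_sub_instruction(sub_instructions, observed_objects, start=0):
    """Toy high-level planner (hypothetical): return the index, at or after
    `start`, of the sub-instruction sharing the most words with the observed
    object labels. Falls back to `start` when nothing matches."""
    observed = {o.lower() for o in observed_objects}
    best_idx, best_score = start, 0
    for i in range(start, len(sub_instructions)):
        score = len(words(sub_instructions[i]) & observed)
        if score > best_score:
            best_idx, best_score = i, score
    return best_idx


def verify_step(sub_instruction, observed_objects):
    """Toy verification (hypothetical): a step counts as confirmed when at
    least one word of the sub-instruction appears among the observed labels."""
    return bool(words(sub_instruction) & {o.lower() for o in observed_objects})


instruction = "Walk to the kitchen. Stop next to the sofa. Go upstairs to the bedroom."
subs = split_instruction(instruction)
current = select_sub_instruction(subs, ["sofa", "lamp"])
confirmed = verify_step(subs[current], ["sofa", "lamp"])
```

In this simplified picture, a failed `verify_step` would be the signal for the low-level planner to explore further or backtrack before moving on to the next sub-instruction.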
