RO · CL · Apr 3

Open-Loop Planning, Closed-Loop Verification: Speculative Verification for VLA

arXiv:2604.02965 · 94.01 citations · Has Code
Predicted impact: top 11% in RO · last 90 days · Originality: Incremental advance
AI Analysis

This work addresses the efficiency and reliability of VLA-based robotic manipulation in dynamic settings and represents an incremental improvement over existing action-chunking methods.

The paper tackles the problem of high inference cost in Vision-Language-Action (VLA) models for embodied control by proposing Speculative Verification for VLA Control (SV-VLA), which combines open-loop planning with closed-loop verification to reduce computation while maintaining robustness in dynamic environments.

Vision-Language-Action (VLA) models, as large foundation models for embodied control, have shown strong performance in manipulation tasks. However, their performance comes at a high inference cost. To improve efficiency, recent methods adopt action chunking, which predicts a sequence of future actions for open-loop execution. Although effective for reducing computation, open-loop execution is sensitive to environmental changes and prone to error accumulation due to the lack of closed-loop feedback. To address this limitation, we propose Speculative Verification for VLA Control (SV-VLA), a framework that combines efficient open-loop long-horizon planning with lightweight closed-loop online verification. Specifically, SV-VLA uses a heavy VLA as a low-frequency macro-planner to generate an action chunk together with a planning context, while a lightweight verifier continuously monitors execution based on the latest observations. Conditioned on both the current observation and the planning context, the verifier compares the planned action against a closed-loop reference action and triggers replanning only when necessary. Experiments demonstrate that SV-VLA combines the efficiency of chunked prediction with the robustness of closed-loop control, enabling efficient and reliable VLA-based control in dynamic environments. Code is available at https://github.com/edsad122/SV-VLA.
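The plan-then-verify loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `Plan`, `run_episode`, and `toy_demo`, the absolute-difference deviation measure, and the fixed threshold are all assumptions made for illustration, and both the planner and the verifier are simple hand-written functions on a 1-D toy task rather than learned models.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Plan:
    actions: List[float]  # open-loop action chunk from the heavy planner
    context: float        # planning context passed to the verifier (here: the target)


def clip(value: float, limit: float = 1.0) -> float:
    """Clamp a value to [-limit, limit]."""
    return max(-limit, min(limit, value))


def run_episode(
    planner: Callable[[float], Plan],
    verifier: Callable[[float, float], float],
    step_env: Callable[[float, float, int], float],
    obs: float,
    num_steps: int,
    threshold: float,
) -> Tuple[float, int]:
    """Execute chunks open-loop, replanning when the verifier disagrees.

    Returns the final observation and the number of (expensive) planner calls.
    """
    planner_calls = 0
    plan = None
    idx = 0
    for t in range(num_steps):
        # Replan when there is no plan yet or the current chunk is used up.
        if plan is None or idx >= len(plan.actions):
            plan, idx = planner(obs), 0
            planner_calls += 1
        planned = plan.actions[idx]
        # Cheap closed-loop reference action from the latest observation.
        reference = verifier(obs, plan.context)
        # Assumed deviation test: replan only if the planned action is stale.
        if abs(planned - reference) > threshold:
            plan, idx = planner(obs), 0
            planner_calls += 1
            planned = plan.actions[idx]
        obs = step_env(obs, planned, t)
        idx += 1
    return obs, planner_calls


def toy_demo(disturb_at: int = -1) -> Tuple[float, int]:
    """1-D point moving toward a target, optionally pushed at one time step."""
    target, chunk_len = 8.0, 4

    def planner(obs: float) -> Plan:
        # Hypothetical stand-in for the heavy VLA: roll the nominal model forward.
        actions, x = [], obs
        for _ in range(chunk_len):
            a = clip(target - x)
            actions.append(a)
            x += a
        return Plan(actions, target)

    def verifier(obs: float, context: float) -> float:
        # Hypothetical stand-in for the lightweight verifier.
        return clip(context - obs)

    def step_env(obs: float, action: float, t: int) -> float:
        push = 5.0 if t == disturb_at else 0.0  # unmodeled disturbance
        return obs + action + push

    return run_episode(planner, verifier, step_env, 0.0, 8, 0.5)


if __name__ == "__main__":
    print(toy_demo())    # undisturbed: (8.0, 2)
    print(toy_demo(2))   # pushed at step 2: (8.0, 3)
```

In the undisturbed run the planner is called only once per four-step chunk. When the toy environment pushes the point at step 2, the verifier's reference action diverges from the stale planned action at the next step and triggers one extra planner call, which is the behaviour the abstract refers to as replanning only when necessary.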

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
