ROAIApr 22

A Vision-Language-Action Model for Adaptive Ultrasound-Guided Needle Insertion and Needle Tracking

arXiv:2604.2034766.11 citationsh-index: 9
Predicted impact top 29% in RO · last 90 daysOriginality Highly original
AI Analysis

This work addresses a critical problem in medical robotics for automated needle insertion, offering a unified adaptive approach that could enhance safety and efficiency in ultrasound-guided interventions.

The paper tackles the challenge of ultrasound-guided needle insertion by proposing a Vision-Language-Action model that unifies needle tracking and control, resulting in higher tracking accuracy, improved insertion success rates, and reduced procedure time compared to state-of-the-art methods and manual operation.

Ultrasound (US)-guided needle insertion is a critical yet challenging procedure due to dynamic imaging conditions and difficulties in needle visualization. Many methods have been proposed for automated needle insertion, but they often rely on hand-crafted pipelines with modular controllers, whose performance degrades in challenging cases. In this paper, a Vision-Language-Action (VLA) model is proposed for adaptive and automated US-guided needle insertion and tracking on a robotic ultrasound (RUS) system. This framework provides a unified approach to needle tracking and needle insertion control, enabling real-time, dynamically adaptive adjustment of insertion based on the obtained needle position and environment awareness. To achieve real-time and end-to-end tracking, a Cross-Depth Fusion (CDF) tracking head is proposed, integrating shallow positional and deep semantic features from the large-scale vision backbone. To adapt the pretrained vision backbone for tracking tasks, a Tracking-Conditioning (TraCon) register is introduced for parameter-efficient feature conditioning. After needle tracking, an uncertainty-aware control policy and an asynchronous VLA pipeline are presented for adaptive needle insertion control, ensuring timely decision-making for improved safety and outcomes. Extensive experiments on both needle tracking and insertion show that our method consistently outperforms state-of-the-art trackers and manual operation, achieving higher tracking accuracy, improved insertion success rates, and reduced procedure time, highlighting promising directions for RUS-based intelligent intervention.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes