Tao Teng

RO
6 papers
24 citations
Novelty 45%
AI Score 40

6 Papers

RO Mar 27
Adapt as You Say: Online Interactive Bimanual Skill Adaptation via Human Language Feedback

Zhuo Li, Dianxi Li, Tao Teng et al.

Developing general-purpose robots capable of autonomously operating in human living environments requires the ability to adapt to continuously evolving task conditions. However, adapting high-dimensional coordinated bimanual skills to novel task variations at deployment remains a fundamental challenge. In this work, we present BiSAIL (Bimanual Skill Adaptation via Interactive Language), a novel framework that enables zero-shot online adaptation of offline-learned bimanual skills through interactive language feedback. The key idea of BiSAIL is to adopt a hierarchical reason-then-modulate paradigm, which first infers generalized adaptation objectives from multimodal task variations, and then adapts bimanual motions via diffusion modulation to achieve the inferred objectives. Extensive real-robot experiments across six bimanual tasks and two dual-arm platforms demonstrate that BiSAIL significantly outperforms existing methods in human-in-the-loop adaptability, task generalization and cross-embodiment scalability. This work enables the development of adaptive bimanual assistants that can be flexibly customized by non-expert users via intuitive verbal corrections. Experimental videos and code are available at https://rip4kobe.github.io/BiSAIL/.

RO Nov 18, 2025
Towards Deploying VLA without Fine-Tuning: Plug-and-Play Inference-Time VLA Policy Steering via Embodied Evolutionary Diffusion

Zhuo Li, Junjia Liu, Zhipeng Dong et al.

Vision-Language-Action (VLA) models have demonstrated significant potential in real-world robotic manipulation. However, pre-trained VLA policies still suffer from substantial performance degradation during downstream deployment. Although fine-tuning can mitigate this issue, its reliance on costly demonstration collection and intensive computation makes it impractical in real-world settings. In this work, we introduce VLA-Pilot, a plug-and-play inference-time policy steering method for zero-shot deployment of pre-trained VLA without any additional fine-tuning or data collection. We evaluate VLA-Pilot on six real-world downstream manipulation tasks across two distinct robotic embodiments, encompassing both in-distribution and out-of-distribution scenarios. Experimental results demonstrate that VLA-Pilot substantially boosts the success rates of off-the-shelf pre-trained VLA policies, enabling robust zero-shot generalization to diverse tasks and embodiments. Experimental videos and code are available at: https://rip4kobe.github.io/vla-pilot/.

RO Sep 15, 2021
Towards Precise Pruning Points Detection using Semantic-Instance-Aware Plant Models for Grapevine Winter Pruning Automation

Miguel Fernandes, Antonello Scaldaferri, Paolo Guadagna et al.

Grapevine winter pruning is a complex task that requires skilled workers to execute it correctly. The complexity makes it time-consuming. It is an operation that requires about 80-120 hours per hectare annually, making an automated robotic system that helps speed up the process a crucial tool in large-size vineyards. We describe (a) a novel expert-annotated dataset for grapevine segmentation, (b) a state-of-the-art neural network implementation, and (c) generation of pruning points following agronomic rules, leveraging the simplified structure of the plant. With this approach, we are able to generate a set of pruning points on the canes, paving the way towards a correct automation of grapevine winter pruning.

CV Jun 8, 2021
Grapevine Winter Pruning Automation: On Potential Pruning Points Detection through 2D Plant Modeling using Grapevine Segmentation

Miguel Fernandes, Antonello Scaldaferri, Giuseppe Fiameni et al.

Grapevine winter pruning is a complex task that requires skilled workers to execute it correctly. The complexity of this task is also the reason why it is time-consuming. Since this operation takes about 80-120 hours/ha to complete, and is therefore even more demanding in large-size vineyards, an automated system can help to speed up the process. To this end, this paper presents a novel multidisciplinary approach that tackles this challenging task. First, object segmentation is performed on grapevine images and used to create a representative model of the grapevine plants. Second, a set of potential pruning points is generated from this plant representation. We describe (a) a methodology for data acquisition and annotation, (b) neural network fine-tuning for grapevine segmentation, (c) an image-processing-based method for creating the representative model of grapevines, starting from the inferred segmentation, and (d) potential pruning point detection and localization, based on the plant model, which is a simplification of the grapevine structure. With this approach, we are able to identify a significant set of potential pruning points on the canes that can be used, with further selection, to derive the final set of real pruning points.

RO May 22, 2021
Whole-Body Control on Non-holonomic Mobile Manipulation for Grapevine Winter Pruning Automation

Tao Teng, Miguel Fernandes, Matteo Gatti et al.

Mobile manipulators, which combine mobility and manipulability, are increasingly being used in unstructured field applications, e.g. vineyards. The coordinated motion of the mobile base and manipulator is therefore essential to overall performance. In this paper, we explore a whole-body motion controller for a robot composed of a 2-DoF non-holonomic wheeled mobile base and a 7-DoF manipulator (non-holonomic wheeled mobile manipulator, NWMM). This robotic platform is designed to efficiently undertake complex grapevine pruning tasks. The control framework guarantees task-priority coordinated motion of the NWMM. Lower-priority tasks are projected into the null space of the top-priority tasks so that higher-priority tasks are completed without interruption from lower-priority tasks. The proposed controller was evaluated in a grapevine spur pruning experiment scenario.
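
The null-space projection mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' controller: it assumes a toy 3-joint system with two scalar tasks, and the function names (`pinv_row`, `project_null`, `prioritized_velocities`) and the Jacobian values are hypothetical, chosen only to show the standard two-level task-priority formula q_dot = J1^+ dx1 + N1 (J2 N1)^+ (dx2 - J2 J1^+ dx1), where N1 = I - J1^+ J1.

```python
# Illustrative two-level task-priority resolution for single-row Jacobians.
# All names and numbers here are assumptions for illustration only.

def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def pinv_row(j):
    """Pseudoinverse of a 1 x n row Jacobian: j^T / (j j^T)."""
    s = dot(j, j)
    if s < 1e-12:
        # Degenerate row: no joint motion can act on this task.
        return [0.0] * len(j)
    return [x / s for x in j]

def project_null(j, v):
    """Apply N = I - j^+ j to v, removing the component of v along j."""
    jp = pinv_row(j)
    c = dot(j, v)
    return [vi - jpi * c for vi, jpi in zip(v, jp)]

def prioritized_velocities(j1, dx1, j2, dx2):
    """Joint velocities that satisfy task 1 exactly and task 2 as far as
    possible without disturbing task 1."""
    # Primary task: minimum-norm joint velocity achieving dx1.
    q1 = [x * dx1 for x in pinv_row(j1)]
    # Secondary Jacobian restricted to the null space of task 1 (J2 N1).
    # N1 is symmetric, so the row J2 N1 equals N1 applied to j2.
    j2n = project_null(j1, j2)
    # Remaining task-2 error after the primary motion is applied.
    residual = dx2 - dot(j2, q1)
    # (J2 N1)^+ is proportional to j2n, so q2 already lies in the null
    # space of task 1 and no further projection is needed.
    q2 = [x * residual for x in pinv_row(j2n)]
    return [a + b for a, b in zip(q1, q2)]

# Compatible tasks: both are achieved.
j1, j2 = [1.0, 1.0, 0.0], [0.0, 1.0, 1.0]
q = prioritized_velocities(j1, 0.2, j2, 0.4)
print(dot(j1, q), dot(j2, q))

# Conflicting tasks (same Jacobian, different targets): task 1 wins.
q_conflict = prioritized_velocities(j1, 0.2, j1, 0.9)
print(dot(j1, q_conflict))
```

In the conflicting case the projected secondary Jacobian vanishes, so the lower-priority task contributes no motion and the higher-priority task velocity is preserved, which is the behaviour the abstract describes. A real whole-body controller would use full matrix Jacobians, a damped pseudoinverse, and the non-holonomic base constraints, none of which are modelled here.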

RO Mar 9, 2021
Formulating Intuitive Stack-of-Tasks using Visuo-Tactile Perception for Collaborative Human-Robot Fine Manipulation

Sunny Katyara, Nikhil Deshpande, Fanny Ficuciello et al.

Enabling robots to work in close proximity to humans necessitates a control framework that not only incorporates multi-sensory information for autonomous and coordinated interactions but also includes perceptive task planning to ensure adaptable and flexible collaborative behaviour. In this research, an intuitive stack-of-tasks (iSoT) formulation is proposed that defines the robot's actions by considering the human-arm postures and the task progression. The framework is augmented with visuo-tactile information to effectively perceive the collaborative environment and intuitively switch between the planned sub-tasks. The visual feedback from depth cameras monitors and estimates the objects' poses and human-arm postures, while the tactile data provides the exploration skills to detect and maintain the desired contacts to avoid object slippage. To evaluate the performance, effectiveness and usability of the proposed framework, assembly and disassembly tasks, performed by human-human and human-robot partners, are considered and analyzed using distinct evaluation metrics, i.e., approach adaptation, grasp correction, task coordination latency, cumulative posture deviation, and task repeatability.