Ya Xiong

RO
h-index 9
5 papers
44 citations
Novelty 48%
AI Score 43

5 Papers

RO · Jan 5
Vision-Based Early Fault Diagnosis and Self-Recovery for Strawberry Harvesting Robots

Meili Sun, Chunjiang Zhao, Lichao Yang et al.

Strawberry harvesting robots faced persistent challenges such as low integration of visual perception, fruit-gripper misalignment, empty grasping/misgrasp, and strawberry slippage from the gripper due to insufficient gripping force, all of which compromised harvesting stability and efficiency in orchard environments. To overcome these issues, this paper proposed a visual fault diagnosis and self-recovery framework that integrated multi-task perception with corrective control strategies. At the core of this framework was SRR-Net, an end-to-end multi-task perception model that simultaneously performed strawberry detection, segmentation, and ripeness estimation, thereby unifying visual perception with fault diagnosis. Building on this integrated perception, a relative error compensation method based on simultaneous target-gripper detection was designed to address positional misalignment, correcting deviations when the error exceeded the tolerance threshold. To mitigate empty grasping/misgrasp and fruit-slippage faults, an early abort strategy was implemented. A micro-optical camera embedded in the end-effector provided real-time visual feedback, enabling grasp classification during the deflating stage and strawberry slip prediction during snap-off through a MobileNet V3-Small classifier and a time-series LSTM classifier. Experiments demonstrated that SRR-Net maintained high perception accuracy. For detection, it achieved a precision of 0.895 and recall of 0.813 on strawberries, and 0.972/0.958 on hands. In segmentation, it yielded a precision of 0.887 and recall of 0.747 for strawberries, and 0.974/0.947 for hands. For ripeness estimation, SRR-Net attained a mean absolute error of 0.035, while simultaneously supporting multi-task perception and sustaining a competitive inference speed of 163.35 FPS.
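The relative error compensation idea (detect target and gripper in the same image, correct only when their offset exceeds a tolerance) might be sketched as follows. This is an illustrative sketch, not the paper's implementation: the `Box` type, the `relative_correction` function, the use of bounding-box centers, and the 10-pixel default tolerance are all assumptions made for the example.

```python
from dataclasses import dataclass
from math import hypot


@dataclass
class Box:
    """Axis-aligned bounding box in image pixels (hypothetical type)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def center(self):
        return ((self.x_min + self.x_max) / 2.0,
                (self.y_min + self.y_max) / 2.0)


def relative_correction(target, gripper, tolerance_px=10.0):
    """Return the (dx, dy) pixel offset the gripper should move by.

    Both boxes come from the same frame, so the offset is relative and
    does not depend on an absolute camera-to-robot calibration.
    The tolerance value is an assumed placeholder, not from the paper.
    """
    tx, ty = target.center()
    gx, gy = gripper.center()
    dx, dy = tx - gx, ty - gy
    if hypot(dx, dy) <= tolerance_px:
        return (0.0, 0.0)  # within tolerance: no correction needed
    return (dx, dy)


if __name__ == "__main__":
    strawberry = Box(100, 100, 140, 140)
    gripper = Box(60, 100, 100, 140)
    print(relative_correction(strawberry, gripper))
```

Converting the pixel offset into a manipulator motion would need a scale or Jacobian that the abstract does not describe, so it is left out here.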

RO · Mar 6
HarvestFlex: Strawberry Harvesting via Vision-Language-Action Policy Adaptation in the Wild

Ziyang Zhao, Shuheng Wang, Zhonghua Miao et al.

This work presents the first study on transferring vision-language-action (VLA) policies to real greenhouse tabletop strawberry harvesting, a long-horizon, unstructured task challenged by occlusion and specular reflections. We built an end-to-end closed-loop system on the HarvestFlex platform using three-view RGB sensing (two fixed scene views plus a wrist-mounted view) and intentionally avoided depth clouds and explicit geometric calibration. We collected 3.71 h of VR teleoperated demonstrations (227 episodes) and fine-tuned pi_0, pi_0.5, and WALL-OSS with full fine-tuning and LoRA. Under a unified 50-trial real-greenhouse protocol and metrics spanning completion, pi_0.5 with full fine-tuning achieved a success rate of 74.0% with 32.6 s/pick and a damage rate of 4.1%. Asynchronous inference-control decoupling further improved performance over synchronous deployment. Results showed non-trivial closed-loop picking with fewer than four hours of real data, while remaining limited by close-range observability loss and contact-dynamics mismatch. A demonstration video is available at: https://youtu.be/bN8ZowZKPMI.
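For readers unfamiliar with the difference between full fine-tuning and LoRA mentioned above, the following sketch shows the standard LoRA formulation, in which a frozen weight matrix W is augmented by a low-rank update scaled by alpha / rank. The function names and the layer sizes in the usage example are hypothetical and are not taken from the paper or from any of the fine-tuned models.

```python
def full_finetune_params(d_out, d_in):
    """Trainable parameters when the whole d_out x d_in matrix is updated."""
    return d_out * d_in


def lora_params(d_out, d_in, rank):
    """Trainable parameters for LoRA factors B (d_out x rank) and A (rank x d_in)."""
    return rank * (d_out + d_in)


def lora_effective_weight(W, B, A, alpha, rank):
    """Return W + (alpha / rank) * (B @ A) using plain nested lists."""
    scale = alpha / rank
    d_out, d_in = len(W), len(W[0])
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(rank))
            for j in range(d_in)
        ]
        for i in range(d_out)
    ]


if __name__ == "__main__":
    # Hypothetical layer size, chosen only to illustrate the ratio.
    d, r = 1024, 8
    print(full_finetune_params(d, d), lora_params(d, d, r))
```

With these illustrative sizes, LoRA trains 16,384 parameters for the layer instead of 1,048,576, which is the usual motivation for comparing it against full fine-tuning when demonstration data is limited.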

CV · Jul 31, 2025
Online Estimation of Table-Top Grown Strawberry Mass in Field Conditions with Occlusions

Jinshan Zhen, Yuanyue Ge, Tianxiao Zhu et al.

Accurate mass estimation of table-top grown strawberries under field conditions remains challenging due to frequent occlusions and pose variations. This study proposes a vision-based pipeline integrating RGB-D sensing and deep learning to enable non-destructive, real-time, online mass estimation. The method employed YOLOv8-Seg for instance segmentation, a cycle-consistent generative adversarial network (CycleGAN) for occluded region completion, and tilt-angle correction to refine frontal projection area calculations. A polynomial regression model then mapped the geometric features to mass. Experiments demonstrated mean mass estimation errors of 8.11% for isolated strawberries and 10.47% for occluded cases. CycleGAN outperformed the large mask inpainting (LaMa) model in occlusion recovery, achieving superior pixel area ratios (PAR) (mean: 0.978 vs. 1.112) and higher intersection over union (IoU) scores (92.3% vs. 47.7% in the [0.9-1] range). This approach addresses critical limitations of traditional methods, offering a robust solution for automated harvesting and yield monitoring with complex occlusion patterns.
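The stages named in this abstract (tilt-angle correction, polynomial mapping from area to mass, and the PAR and IoU metrics for occlusion recovery) can be illustrated with a small sketch. Several things here are assumptions rather than details from the paper: the cosine model for tilt correction, the definition of PAR as recovered area divided by ground-truth area, and the polynomial coefficients, which are placeholders. IoU is the standard definition.

```python
from math import cos, radians


def frontal_area(projected_area, tilt_deg):
    """Assumed cosine model: a fruit tilted by tilt_deg away from the
    image plane appears foreshortened by cos(tilt)."""
    return projected_area / cos(radians(tilt_deg))


def mass_from_area(area, coeffs):
    """Evaluate a polynomial (highest-order coefficient first) with
    Horner's rule. The coefficients are placeholders, not fitted values."""
    result = 0.0
    for c in coeffs:
        result = result * area + c
    return result


def _area(mask):
    return sum(sum(1 for v in row if v) for row in mask)


def pixel_area_ratio(pred, truth):
    """Assumed PAR definition: recovered area / ground-truth area.
    A value near 1 means the completed mask has the right size."""
    return _area(pred) / _area(truth)


def iou(pred, truth):
    """Intersection over union of two binary masks of equal shape."""
    inter = union = 0
    for p_row, t_row in zip(pred, truth):
        for p, t in zip(p_row, t_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter / union


if __name__ == "__main__":
    pred = [[1, 1], [1, 0]]
    truth = [[1, 1], [0, 0]]
    print(pixel_area_ratio(pred, truth), iou(pred, truth))
```

Under this reading, the reported mean PAR of 0.978 for CycleGAN versus 1.112 for LaMa would mean CycleGAN slightly under-fills the occluded region while LaMa over-fills it by a larger margin.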

RO · Apr 20, 2020
Push and Drag: An Active Obstacle Separation Method for Fruit Harvesting Robots

Ya Xiong, Yuanyue Ge, Pål Johan From

Selectively picking a target fruit surrounded by obstacles is one of the major challenges for fruit harvesting robots. Different from traditional obstacle avoidance methods, this paper presents an active obstacle separation strategy that combines push and drag motions. The separation motion and trajectory are generated based on the 3D visual perception of the obstacle information around the target. A linear push is used to clear the obstacles from the area below the target, while a zig-zag push that contains several linear motions is proposed to push aside denser obstacles. The zig-zag push can generate multi-directional pushes, and the side-to-side motion can break the static contact force between the target and obstacles, thus helping the gripper to receive a target in more complex situations. Moreover, we propose a novel drag operation to address the issue of mis-capturing obstacles located above the target, in which the gripper drags the target to a place with fewer obstacles and then pushes back to move the obstacles aside for further detachment. Furthermore, an image processing pipeline consisting of color thresholding, object detection using deep learning, and point cloud operations is developed to implement the proposed method on a harvesting robot. Field tests show that the proposed method can improve the picking performance substantially. This method helps to enable complex clusters of fruits to be harvested with a higher success rate than conventional methods.
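A zig-zag push made of several linear motions could be generated as a list of waypoints, as in the sketch below. The geometry is an assumption for illustration only: the gripper is taken to approach from below along z while alternating sideways along x, and the function name, the amplitude, and the number of segments are hypothetical. The paper derives the actual motion from 3D perception of the obstacles, which is not modeled here.

```python
def zigzag_push(start, end_z, amplitude, n_segments):
    """Return waypoints for an upward push that alternates side to side.

    start      -- (x, y, z) position below the target
    end_z      -- height at which the push finishes
    amplitude  -- lateral offset of each zig or zag along x
    n_segments -- number of linear motions in the push
    """
    if n_segments < 1:
        raise ValueError("n_segments must be at least 1")
    x0, y0, z0 = start
    step = (end_z - z0) / n_segments
    waypoints = [start]
    for i in range(1, n_segments + 1):
        if i == n_segments:
            x = x0  # finish centered under the target
        else:
            side = 1 if i % 2 == 1 else -1  # alternate right, left
            x = x0 + side * amplitude
        waypoints.append((x, y0, z0 + i * step))
    return waypoints


if __name__ == "__main__":
    for wp in zigzag_push((0.0, 0.0, 0.0), 4.0, 1.0, 4):
        print(wp)
```

Setting `n_segments=1` reduces the path to a single linear push, which matches the simpler case described for sparse obstacles.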

RO · Apr 25, 2018
Design and Evaluation of a Novel Cable-Driven Gripper with Perception Capabilities for Strawberry Picking Robots

Ya Xiong, Pål Johan From, Volkan Isler

This paper presents a novel cable-driven gripper with perception capabilities for autonomous harvesting of strawberries. Experiments show that the gripper allows for more accurate and faster picking of strawberries compared to existing systems. The gripper consists of four functional parts for sensing, picking, transmission, and storing. It has six fingers that open to form a closed space to swallow a target strawberry and push other surrounding berries away from the target. Equipped with three IR sensors, the gripper controls a manipulator arm to correct for positional error, and can thus pick strawberries that are not exactly localized by the vision algorithm, improving robustness. Experiments show that the gripper is gentle on the berries, as it merely cuts the stem and there is no physical interaction with the berries during the cutting process. We show that the gripper has a close-to-perfect picking success rate when addressing isolated strawberries. By including internal perception, we get high positional error tolerance and avoid using slow, high-level closed-loop control. Moreover, the gripper can store several berries, which reduces the overall travel distance for the manipulator and decreases the time needed to pick a single strawberry substantially. The experiments show that the gripper design decreased picking execution time noticeably compared to results found in the literature.