CV | Mar 8, 2023
O2RNet: Occluder-Occludee Relational Network for Robust Apple Detection in Clustered Orchard Environments
Pengyu Chu, Zhaojian Li, Kaixiang Zhang et al.
Automated apple harvesting has attracted significant research interest in recent years due to its potential to revolutionize the apple industry, addressing the issues of shortage and high costs in labor. One key technology to fully enable efficient automated harvesting is accurate and robust apple detection, which is challenging due to complex orchard environments that involve varying lighting conditions and foliage/branch occlusions. Furthermore, clustered apples are common in the orchard, which brings additional challenges as the clustered apples may be identified as one apple. This will cause issues in localization for subsequent robotic operations. In this paper, we present the development of a novel deep learning-based apple detection framework, Occluder-Occludee Relational Network (O2RNet), for robust detection of apples in such clustered environments. This network exploits the occluder-occludee relationship modeling head by introducing a feature expansion structure to enable the combination of layered traditional detectors to split clustered apples and foliage occlusions. More specifically, we collect a comprehensive apple orchard image dataset under different lighting conditions (overcast, front lighting, and back lighting) with frequent apple occlusions. We then develop a novel occlusion-aware network for apple detection, in which a feature expansion structure is incorporated into the convolutional neural networks to extract additional features generated by the original network for occluded apples. Comprehensive evaluations are performed, which show that the developed O2RNet outperforms state-of-the-art models with a higher accuracy of 94% and a higher F1-score of 0.88 on apple detection.
CV | May 14, 2024 | Code
MetaFruit Meets Foundation Models: Leveraging a Comprehensive Multi-Fruit Dataset for Advancing Agricultural Foundation Models
Jiajia Li, Kyle Lammers, Xunyuan Yin et al.
Fruit harvesting poses a significant labor and financial burden for the industry, highlighting the critical need for advancements in robotic harvesting solutions. Machine vision-based fruit detection has been recognized as a crucial component for robust identification of fruits to guide robotic manipulation. Despite considerable progress in leveraging deep learning and machine learning techniques for fruit detection, a common shortfall is the inability to swiftly extend the developed models across different orchards and/or various fruit species. Additionally, the limited availability of pertinent data further compounds these challenges. In this work, we introduce MetaFruit, the largest publicly available multi-class fruit dataset, comprising 4,248 images and 248,015 manually labeled instances across diverse U.S. orchards. Furthermore, this study proposes an innovative open-set fruit detection system leveraging advanced Vision Foundation Models (VFMs) that can adeptly identify a wide array of fruit types under varying orchard conditions. This system not only demonstrates remarkable adaptability in learning from minimal data through few-shot learning but also shows the ability to interpret human instructions for subtle detection tasks. The developed foundation model is comprehensively evaluated using several metrics and outperforms the existing state-of-the-art algorithms on both our MetaFruit dataset and other open-sourced fruit datasets, thereby setting a new benchmark in the field of agricultural technology and robotic harvesting. The MetaFruit dataset and detection framework are open-sourced to foster future research in vision-based fruit harvesting, marking a significant stride toward addressing the urgent needs of the agricultural sector.
RO | Oct 21, 2020
System Design and Control of an Apple Harvesting Robot
Kaixiang Zhang, Kyle Lammers, Pengyu Chu et al.
There is a growing need for robotic apple harvesting due to decreasing availability and rising cost in labor. Towards the goal of developing a viable robotic system for apple harvesting, this paper presents synergistic mechatronic design and motion control of a robotic apple harvesting prototype, which lays a critical foundation for future advancements. Specifically, we develop a deep learning-based fruit detection and localization system using an RGB-D camera. A three degree-of-freedom manipulator is then designed with a hybrid pneumatic/motor actuation mechanism to achieve fast and dexterous movements. A vacuum-based end-effector is used for apple detaching. These three components are integrated into a robotic apple harvesting prototype with simplicity, compactness, and robustness. Moreover, a nonlinear velocity-based control scheme is developed for the manipulator to achieve accurate and agile motion control. Test experiments are conducted to demonstrate the performance of the developed apple harvesting robot.
CV | Oct 19, 2020
DeepApple: Deep Learning-based Apple Detection using a Suppression Mask R-CNN
Pengyu Chu, Zhaojian Li, Kyle Lammers et al.
Robotic apple harvesting has received much research attention in the past few years due to growing shortage and rising cost in labor. One key enabling technology towards automated harvesting is accurate and robust apple detection, which poses great challenges as a result of the complex orchard environment that involves varying lighting conditions and foliage/branch occlusions. This letter reports on the development of a novel deep learning-based apple detection framework named DeepApple. Specifically, we first collect a comprehensive apple orchard dataset for 'Gala' and 'Blondee' apples, using a color camera, under different lighting conditions (sunny vs. overcast and front lighting vs. back lighting). We then develop a novel suppression Mask R-CNN for apple detection, in which a suppression branch is added to the standard Mask R-CNN to suppress non-apple features generated by the original network. Comprehensive evaluations are performed, which show that the developed suppression Mask R-CNN network outperforms state-of-the-art models with a higher F1-score of 0.905 and a detection time of 0.25 second per frame on a standard desktop computer.
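The F1-scores quoted in the detection abstracts above combine precision and recall into a single number. As a minimal sketch of how such a score is computed from detection counts, the function below assumes that predicted boxes have already been matched to ground-truth apples (for example by an IoU threshold, which is an assumption here and not a detail given in the abstracts). The function name and the counts in the usage note are hypothetical and are not taken from any of the papers.

```python
def f1_score(tp: int, fp: int, fn: int) -> float:
    """Return the F1-score from detection counts.

    tp: detections matched to a ground-truth apple (true positives)
    fp: detections with no matching apple (false positives)
    fn: ground-truth apples that were not detected (false negatives)
    """
    if tp == 0:
        # No correct detections: precision and recall are both 0 (or undefined),
        # so F1 is conventionally reported as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```

For illustration, with 90 matched detections, 10 spurious detections, and 10 missed apples, precision and recall are both 0.9, so `f1_score(90, 10, 10)` is approximately 0.9.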