Real-time object detection and robotic manipulation for agriculture using a YOLO-based learning approach
This work addresses the need for improved automation in agriculture, specifically for crop harvesting, but it is incremental as it combines existing CNN architectures in a simulated setting.
The study addresses automated crop harvesting with a framework that simultaneously detects crops and determines robotic grasping positions in a simulated environment, achieving enhanced harvesting efficiency through a YOLO-based approach with data augmentation.
Optimising the harvesting of commonly cultivated crops is of great importance to agricultural industrialisation. Machine vision now enables automated crop identification and has improved harvesting efficiency, but challenges remain. This study presents a new framework that combines two separate convolutional neural network (CNN) architectures to perform crop detection and harvesting (robotic manipulation) simultaneously in a simulated environment. Crop images from the simulated environment are subjected to random rotation, cropping, and brightness and contrast adjustments to generate an augmented dataset. The You Only Look Once (YOLO) framework, with traditional rectangular bounding boxes, is used for crop localisation. The proposed method then passes the acquired image data through a Visual Geometry Group (VGG) model to determine the grasping positions for robotic manipulation.
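The augmentation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the parameter ranges (brightness offset in [-0.2, 0.2], contrast gain in [0.8, 1.2]), the restriction of rotations to multiples of 90 degrees, and the representation of an image as a nested list of grayscale values in [0, 1] are all assumptions made for illustration.

```python
import random


def rotate90(img, k):
    """Rotate a 2D image (list of rows) clockwise by k * 90 degrees."""
    out = [list(row) for row in img]
    for _ in range(k % 4):
        # Reversing the row order and transposing gives one clockwise turn.
        out = [list(row) for row in zip(*out[::-1])]
    return out


def random_crop(img, crop_h, crop_w, rng):
    """Cut a crop_h x crop_w window at a random position."""
    h, w = len(img), len(img[0])
    if crop_h > h or crop_w > w:
        raise ValueError("crop window larger than image")
    top = rng.randint(0, h - crop_h)    # randint is inclusive at both ends
    left = rng.randint(0, w - crop_w)
    return [row[left:left + crop_w] for row in img[top:top + crop_h]]


def adjust_brightness_contrast(img, brightness, contrast):
    """Scale pixels around mid-grey (0.5), add an offset, clip to [0, 1]."""
    return [
        [min(1.0, max(0.0, contrast * (p - 0.5) + 0.5 + brightness)) for p in row]
        for row in img
    ]


def augment(img, crop_h, crop_w, rng):
    """Apply rotation, cropping, and photometric jitter in sequence.

    The ranges below are assumed values, not taken from the paper.
    """
    out = rotate90(img, rng.randrange(4))
    out = random_crop(out, crop_h, crop_w, rng)
    brightness = rng.uniform(-0.2, 0.2)
    contrast = rng.uniform(0.8, 1.2)
    return adjust_brightness_contrast(out, brightness, contrast)


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so the run is repeatable
    # Synthetic 8 x 8 gradient standing in for a simulated crop image.
    image = [[(r * 8 + c) / 63 for c in range(8)] for r in range(8)]
    augmented = augment(image, 4, 4, rng)
    print(len(augmented), len(augmented[0]))
```

When such transforms are used to train a detector, the geometric ones (rotation and cropping) would also have to be applied to the bounding-box labels so that boxes and pixels stay aligned. The sketch omits this step.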