Comprehensive Performance Evaluation of YOLOv12, YOLO11, YOLOv10, YOLOv9 and YOLOv8 on Detecting and Counting Fruitlet in Complex Orchard Environments
This work addresses fruitlet detection and counting in complex orchard environments for agricultural applications, but its contribution is incremental: it benchmarks existing methods on new data rather than proposing a new one.
This study evaluated YOLO-based object detection algorithms for detecting and counting fruitlets in orchards. YOLOv12l achieved the highest recall (0.900), YOLOv10x and YOLOv9 GELAN-c had the top precision (0.908 and 0.903, respectively), and YOLO11n showed superior counting accuracy, with RMSE values of 4.51-4.96, as well as the fastest inference speed (2.4 ms).
This study conducted an extensive real-world evaluation of all configurations of the You Only Look Once (YOLO)-based object detection algorithms YOLOv8, YOLOv9, YOLOv10, YOLO11, and YOLOv12. Models were assessed using precision, recall, mean Average Precision at 50% Intersection over Union (mAP@50), and computational efficiency across the pre-processing, inference, and post-processing stages for detecting immature green fruitlets in commercial orchards. Field-level fruitlet counting was also validated using images captured with both Intel RealSense and iPhone 14 Pro Max sensors. YOLOv12l achieved the highest recall (0.900), while YOLOv10x and YOLOv9 GELAN-c reported the top precision scores of 0.908 and 0.903, respectively. YOLOv9 GELAN-base and GELAN-e achieved the highest mAP@50 (0.935), followed by YOLO11s (0.933) and YOLOv12l (0.931). In counting validation, YOLO11n demonstrated superior accuracy, with RMSE values of 4.51-4.96 and MAE values of 3.85-7.73 across four apple varieties. Sensor-specific training on Intel RealSense data further improved detection performance. YOLO11n also recorded the fastest inference speed (2.4 ms), outperforming YOLOv8n, YOLOv9 GELAN-s, YOLOv10n, and YOLOv12n, affirming its suitability for real-time orchard applications.
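For readers unfamiliar with the reported metrics, the following is a minimal sketch of how precision, recall, RMSE, and MAE are conventionally computed. It is not the authors' evaluation code: the function names `precision_recall` and `counting_errors` are hypothetical, and the sample counts are invented for illustration rather than taken from the study.

```python
import math


def precision_recall(tp, fp, fn):
    """Precision and recall from true-positive, false-positive, and
    false-negative detection counts at a fixed IoU threshold (e.g. 0.5)."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


def counting_errors(predicted, actual):
    """RMSE and MAE between predicted and manually counted fruitlets,
    one pair of counts per image."""
    if len(predicted) != len(actual) or not predicted:
        raise ValueError("need two non-empty sequences of equal length")
    diffs = [p - a for p, a in zip(predicted, actual)]
    rmse = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    mae = sum(abs(d) for d in diffs) / len(diffs)
    return rmse, mae


# Illustrative (made-up) per-image counts, not data from the study.
pred_counts = [12, 9, 15, 7]
true_counts = [14, 9, 11, 8]
rmse, mae = counting_errors(pred_counts, true_counts)
print(f"RMSE={rmse:.2f}, MAE={mae:.2f}")
```

RMSE weights large per-image miscounts more heavily than MAE does, which is why the two are usually reported together when validating counts.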