ODverse33: Is the New YOLO Version Always Better? A Multi Domain benchmark from YOLO v5 to v11
This study offers guidance to users of object detection models by evaluating real-world performance across diverse domains. Its contribution is incremental: it benchmarks existing methods and does not introduce new algorithms.
The paper asks whether newer YOLO versions consistently outperform older ones. It introduces ODverse33, a benchmark of 33 datasets across 11 domains, and finds that performance gains vary by domain, with improvements measured in metrics such as mAP and FPS.
You Only Look Once (YOLO) models are widely used to build real-time object detectors across many domains. As new YOLO versions are released with increasing frequency, key questions arise. Are newer versions always better than their predecessors? What are the core innovations in each YOLO version, and how do these changes translate into real-world performance gains? In this paper, we summarize the key innovations from YOLOv1 to YOLOv11, introduce a comprehensive benchmark called ODverse33, which includes 33 datasets spanning 11 diverse domains (Autonomous driving, Agricultural, Underwater, Medical, Videogame, Industrial, Aerial, Wildlife, Retail, Microscopic, and Security), and explore the practical impact of model improvements in real-world, multi-domain applications through extensive experimental results. We hope this study offers guidance to the many users of object detection models and serves as a reference for future real-time object detector development.
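The question of whether a newer version is "always better" comes down to comparing per-domain scores rather than a single overall average. The following minimal sketch illustrates that kind of comparison. It is not the paper's evaluation code: the model names, dataset names, and mAP values are hypothetical placeholders chosen only to show the aggregation, and `domain_means` and `best_per_domain` are illustrative helper names.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical placeholder scores, NOT results from the paper.
# Key: (model, domain, dataset) -> mAP on that dataset.
scores = {
    ("yolov5", "Aerial", "dataset_a"): 0.40,
    ("yolov11", "Aerial", "dataset_a"): 0.44,
    ("yolov5", "Aerial", "dataset_b"): 0.50,
    ("yolov11", "Aerial", "dataset_b"): 0.52,
    ("yolov5", "Medical", "dataset_c"): 0.61,
    ("yolov11", "Medical", "dataset_c"): 0.58,
}


def domain_means(scores):
    """Average each model's mAP over all datasets within a domain."""
    buckets = defaultdict(list)
    for (model, domain, _dataset), value in scores.items():
        buckets[(domain, model)].append(value)
    return {key: mean(values) for key, values in buckets.items()}


def best_per_domain(scores):
    """Return the model with the highest mean mAP in each domain."""
    best = {}
    for (domain, model), value in domain_means(scores).items():
        if domain not in best or value > best[domain][1]:
            best[domain] = (model, value)
    return {domain: model for domain, (model, _value) in best.items()}


if __name__ == "__main__":
    for domain, model in sorted(best_per_domain(scores).items()):
        print(f"{domain}: {model}")
```

With these placeholder numbers the newer model wins in one domain and loses in the other, which is the kind of domain-dependent outcome a multi-domain benchmark is designed to surface.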