CV, LG · Sep 18, 2025

Synthetic-to-Real Object Detection using YOLOv11 and Domain Randomization Strategies

arXiv:2509.15045v1 · h-index: 1
Originality: Incremental advance
AI Analysis

This paper addresses the problem of reducing reliance on real-world data for object detection. It is an incremental advance, since it builds on existing YOLO and domain randomization methods.

This paper tackles the synthetic-to-real domain gap in object detection by training a YOLOv11 model on synthetic data with domain randomization to detect soup cans, achieving a final mAP@50 of 0.910 on the competition's hidden test set.

This paper addresses the synthetic-to-real domain gap in object detection, focusing on training a YOLOv11 model to detect a specific object (a soup can) using only synthetic data and domain randomization strategies. The methodology involves extensive experimentation with data augmentation, dataset composition, and model scaling. While synthetic validation metrics were consistently high, they proved to be poor predictors of real-world performance. Consequently, models were also evaluated qualitatively, through visual inspection of predictions, and quantitatively, on a manually labeled real-world test set, to guide development. Final mAP@50 scores were provided by the official Kaggle competition. Key findings indicate that increasing synthetic dataset diversity, specifically by including varied perspectives and complex backgrounds, combined with carefully tuned data augmentation, was crucial in bridging the domain gap. The best-performing configuration, a YOLOv11l model trained on an expanded and diverse dataset, achieved a final mAP@50 of 0.910 on the competition's hidden test set. This result demonstrates the potential of a synthetic-only training approach while also highlighting the remaining challenges in fully capturing real-world variability.
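
Since the results above are reported as mAP@50, a small worked example of that metric may help. The sketch below computes AP at an IoU threshold of 0.5 for a single class, in which case mAP@50 and AP@50 coincide (as with a single soup-can class). It is not the paper's or the Kaggle competition's evaluator: the function names `iou` and `ap50`, the `(x1, y1, x2, y2)` box format, the PASCAL VOC-style matching, and the all-point interpolation are assumptions made for illustration, and other tools may interpolate the precision-recall curve differently.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: List[Tuple[int, float, Box]], gts: Dict[int, List[Box]]) -> float:
    """Single-class AP at IoU >= 0.5 (hypothetical helper, VOC-style matching).

    preds: (image_id, confidence, box) triples.
    gts:   image_id -> list of ground-truth boxes.
    """
    n_gt = sum(len(boxes) for boxes in gts.values())
    if n_gt == 0:
        return 0.0
    matched = {img: [False] * len(boxes) for img, boxes in gts.items()}

    # Walk predictions from most to least confident, flagging each as TP or FP.
    flags = []
    for img, _, box in sorted(preds, key=lambda p: -p[1]):
        best_iou, best_j = 0.0, -1
        for j, gt_box in enumerate(gts.get(img, [])):
            overlap = iou(box, gt_box)
            if overlap > best_iou:
                best_iou, best_j = overlap, j
        is_tp = best_iou >= 0.5 and not matched[img][best_j]
        if is_tp:
            matched[img][best_j] = True  # each ground truth matches at most once
        flags.append(is_tp)

    # Cumulative precision and recall along the ranked list.
    tp = fp = 0
    recalls, precisions = [], []
    for is_tp in flags:
        tp += is_tp
        fp += not is_tp
        recalls.append(tp / n_gt)
        precisions.append(tp / (tp + fp))

    # Make precision non-increasing in recall, then integrate the curve.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    ap, prev_recall = 0.0, 0.0
    for r, p in zip(recalls, precisions):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


if __name__ == "__main__":
    # Toy data: two images with one object each; one spurious detection.
    gts = {0: [(0, 0, 10, 10)], 1: [(0, 0, 10, 10)]}
    preds = [
        (0, 0.9, (0, 0, 10, 10)),    # exact hit
        (0, 0.8, (20, 20, 30, 30)),  # false positive
        (1, 0.7, (1, 1, 10, 10)),    # IoU 0.81, counts as a hit
    ]
    print(round(ap50(preds, gts), 4))
```

In this toy case the false positive ranked between the two hits lowers the score below 1.0 even though both objects are eventually found, which illustrates why confident spurious detections on real images can hurt mAP@50.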

