Practical Insights into Semi-Supervised Object Detection Approaches
This work offers practical guidance to computer vision researchers and practitioners choosing methods for low-data object detection. Its contribution is incremental: it is a comparative analysis and does not introduce new techniques.
The paper compares three state-of-the-art semi-supervised object detection methods (MixPL, Semi-DETR, and Consistent-Teacher) to analyze how performance varies with the number of labeled images. Experiments on MS-COCO, Pascal VOC, and a custom Beetle dataset reveal trade-offs among accuracy, model size, and latency.
Learning in data-scarce settings has recently gained significant attention in the research community. Semi-supervised object detection (SSOD) aims to improve detection performance by leveraging a large number of unlabeled images alongside a limited number of labeled images (a.k.a. few-shot learning). In this paper, we present a comprehensive comparison of three state-of-the-art SSOD approaches, MixPL, Semi-DETR, and Consistent-Teacher, with the goal of understanding how performance varies with the number of labeled images. We conduct experiments on the MS-COCO and Pascal VOC datasets, two popular object detection benchmarks that allow for standardized evaluation. In addition, we evaluate the SSOD approaches on a custom Beetle dataset, which enables us to gain insights into their performance on specialized datasets with a smaller number of object categories. Our findings highlight the trade-offs among accuracy, model size, and latency, providing insights into which methods are best suited for low-data regimes.
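To make the experimental setup concrete, the following is a minimal sketch of how a dataset might be partitioned into labeled and unlabeled subsets for a given number of labeled images. It is an illustration only: the function name `split_labeled_unlabeled`, the seed, and the labeled counts in the usage example are assumptions, not the protocol used in the paper.

```python
import random
from typing import List, Sequence, Tuple


def split_labeled_unlabeled(
    image_ids: Sequence[int], num_labeled: int, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly pick `num_labeled` image IDs to keep annotations for.

    The remaining IDs are treated as unlabeled. This helper is hypothetical
    and only illustrates the labeled/unlabeled split common in SSOD studies.
    """
    if not 0 <= num_labeled <= len(image_ids):
        raise ValueError("num_labeled must be between 0 and len(image_ids)")

    # A seeded generator keeps the split reproducible across runs.
    rng = random.Random(seed)
    shuffled = list(image_ids)
    rng.shuffle(shuffled)

    labeled = sorted(shuffled[:num_labeled])
    unlabeled = sorted(shuffled[num_labeled:])
    return labeled, unlabeled


if __name__ == "__main__":
    all_ids = list(range(1000))  # stand-in for real image IDs
    for n in (10, 50, 100):  # assumed labeled-image budgets
        labeled, unlabeled = split_labeled_unlabeled(all_ids, n, seed=42)
        print(n, len(labeled), len(unlabeled))
```

Fixing the seed per budget means each method under comparison would see the same labeled subset, so differences in accuracy can be attributed to the method rather than to the sample.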