CV
Sep 26, 2024

Search and Detect: Training-Free Long Tail Object Detection via Web-Image Retrieval

arXiv:2409.18733v1
4 citations
h-index: 11
Originality: Highly original
AI Analysis

This work addresses the challenge of detecting rare objects in images for computer vision applications, offering a potentially cost-effective alternative to data annotation and training.

The paper tackles the problem of long-tail object detection by introducing SearchDet, a training-free framework that uses web-image retrieval to enhance open-vocabulary detection, achieving improvements of 48.7% mAP on ODinW and 59.1% mAP on LVIS compared to state-of-the-art models.

In this paper, we introduce SearchDet, a training-free long-tail object detection framework that significantly enhances open-vocabulary object detection performance. SearchDet retrieves a set of positive and negative images of an object to ground, embeds these images, and computes an input image-weighted query which is used to detect the desired concept in the image. Our proposed method is simple and training-free, yet achieves over 48.7% mAP improvement on ODinW and 59.1% mAP improvement on LVIS compared to state-of-the-art models such as GroundingDINO. We further show that our approach of basing object detection on a set of Web-retrieved exemplars is stable with respect to variations in the exemplars, suggesting a path towards eliminating costly data annotation and training procedures.
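The abstract describes the pipeline only at a high level (retrieve positive and negative exemplars, embed them, form a query weighted by the input image, detect). The following is a minimal sketch of that idea, not the authors' implementation. It assumes the embeddings are already available as plain lists of floats, that exemplar weights come from a softmax over cosine similarity with the input image embedding, and that negatives are subtracted from the query. The function names, the `neg_scale` parameter, and the toy vectors are all hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    # Scale to unit length so that dot products are cosine similarities.
    n = math.sqrt(dot(v, v))
    return [x / n for x in v] if n > 0 else list(v)

def softmax(scores):
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_mean(vectors, weights):
    dim = len(vectors[0])
    return [sum(w * v[i] for v, w in zip(vectors, weights)) for i in range(dim)]

def build_query(image_emb, positives, negatives, neg_scale=1.0):
    # Assumption: exemplars more similar to the input image get more weight.
    image_emb = normalize(image_emb)
    positives = [normalize(p) for p in positives]
    negatives = [normalize(n) for n in negatives]
    pos_w = softmax([dot(image_emb, p) for p in positives])
    query = weighted_mean(positives, pos_w)
    if negatives:
        # Assumption: push the query away from the weighted negative exemplars.
        neg_w = softmax([dot(image_emb, n) for n in negatives])
        neg_mean = weighted_mean(negatives, neg_w)
        query = [q - neg_scale * n for q, n in zip(query, neg_mean)]
    return normalize(query)

def score_regions(query, region_embs):
    # Cosine similarity between the query and each candidate region embedding.
    return [dot(query, normalize(r)) for r in region_embs]

# Toy 2-D example with made-up embeddings.
positives = [[1.0, 0.0], [0.9, 0.1]]   # stand-ins for web images of the target concept
negatives = [[0.0, 1.0]]               # stand-in for a web image of a confusable concept
image_emb = [0.7, 0.7]                 # stand-in for the whole-image embedding
regions = [[1.0, 0.05], [0.1, 1.0]]    # stand-ins for candidate region embeddings

query = build_query(image_emb, positives, negatives)
scores = score_regions(query, regions)
best = max(range(len(scores)), key=lambda i: scores[i])
print(best, scores)
```

In this toy setup the first region, which lies close to the positive exemplars, receives the higher score. A real system would obtain the embeddings from an image encoder and the candidate regions from a proposal or segmentation step, neither of which is specified in the text above.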

