CVLGJul 11, 2022

Scaling Novel Object Detection with Weakly Supervised Detection Transformers

arXiv:2207.05205v316 citationsh-index: 34Has Code
Originality Highly original
AI Analysis

This addresses the practical limitation of current weakly supervised object detection methods, which are constrained to small data scales and require multiple training rounds, for researchers and practitioners needing scalable novel object detection.

The paper tackles the problem of finetuning object detectors for novel objects without expensive bounding box annotations by proposing a Weakly Supervised Detection Transformer that enables efficient knowledge transfer from large-scale pretraining to weakly supervised finetuning on hundreds of novel objects. The approach outperforms previous state-of-the-art models on large-scale datasets and shows that class quantity is more important than image quantity for pretraining.

A critical object detection task is finetuning an existing model to detect novel objects, but the standard workflow requires bounding box annotations which are time-consuming and expensive to collect. Weakly supervised object detection (WSOD) offers an appealing alternative, where object detectors can be trained using image-level labels. However, the practical application of current WSOD models is limited, as they only operate at small data scales and require multiple rounds of training and refinement. To address this, we propose the Weakly Supervised Detection Transformer, which enables efficient knowledge transfer from a large-scale pretraining dataset to WSOD finetuning on hundreds of novel objects. Additionally, we leverage pretrained knowledge to improve the multiple instance learning (MIL) framework often used in WSOD methods. Our experiments show that our approach outperforms previous state-of-the-art models on large-scale novel object detection datasets, and our scaling study reveals that class quantity is more important than image quantity for WSOD pretraining. The code is available at https://github.com/tmlabonte/weakly-supervised-DETR.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes