CV | Mar 30, 2021

DAP: Detection-Aware Pre-training with Weak Supervision

arXiv:2103.16651v1 | 16 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for more efficient pre-training in computer vision, particularly for object detection, by leveraging existing classification datasets without requiring expensive bounding-box annotations. It is an incremental advance, since it builds on existing weak supervision techniques.

The paper introduces detection-aware pre-training (DAP), which uses weakly labeled classification datasets to make pre-trained models location-aware. This improves sample efficiency and convergence speed in downstream detection, with large accuracy gains in low-data scenarios on the VOC and COCO benchmarks.

This paper presents a detection-aware pre-training (DAP) approach, which leverages only weakly-labeled classification-style datasets (e.g., ImageNet) for pre-training, but is specifically tailored to benefit object detection tasks. In contrast to the widely used image classification-based pre-training (e.g., on ImageNet), which does not include any location-related training tasks, we transform a classification dataset into a detection dataset through a weakly supervised object localization method based on Class Activation Maps to directly pre-train a detector, making the pre-trained model location-aware and capable of predicting bounding boxes. We show that DAP can outperform the traditional classification pre-training in terms of both sample efficiency and convergence speed in downstream detection tasks including VOC and COCO. In particular, DAP boosts the detection accuracy by a large margin when the number of examples in the downstream task is small.
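To make the classification-to-detection conversion concrete, here is a minimal sketch of how a Class Activation Map can be turned into a pseudo bounding box. It is an illustration under stated assumptions, not the paper's implementation: the function names, the nested-list feature layout, the relative threshold of 0.5, and the choice of a single tight box around all above-threshold cells are all hypothetical simplifications. The abstract does not specify these details.

```python
def class_activation_map(features, class_weights):
    """Weighted sum of K feature channels for one class.

    features: list of K channel maps, each an H x W nested list of floats
              (assumed to be the last conv-layer output before pooling).
    class_weights: K floats, the classifier weights for the labeled class.
    """
    height, width = len(features[0]), len(features[0][0])
    cam = [[0.0] * width for _ in range(height)]
    for channel, weight in zip(features, class_weights):
        for y in range(height):
            for x in range(width):
                cam[y][x] += weight * channel[y][x]
    return cam


def cam_to_box(cam, rel_threshold=0.5):
    """Return (x_min, y_min, x_max, y_max) in feature-map cells, or None.

    Assumption: keep cells whose activation is at least `rel_threshold`
    of the way from the map's minimum to its maximum, then take the tight
    box around them. x_max and y_max are exclusive.
    """
    lowest = min(min(row) for row in cam)
    highest = max(max(row) for row in cam)
    if highest == lowest:
        return None  # flat map: no localization signal
    cutoff = lowest + rel_threshold * (highest - lowest)
    cells = [
        (x, y)
        for y, row in enumerate(cam)
        for x, value in enumerate(row)
        if value >= cutoff
    ]
    xs = [x for x, _ in cells]
    ys = [y for _, y in cells]
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


# Toy example: channel 0 fires on a central 2x2 block, channel 1 is uniform.
toy_features = [
    [[0.0, 0.0, 0.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [0.0, 1.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 0.0]],
    [[0.5] * 4 for _ in range(4)],
]
toy_weights = [2.0, 0.0]  # hypothetical classifier weights for the image label

toy_cam = class_activation_map(toy_features, toy_weights)
toy_box = cam_to_box(toy_cam)
```

In a real pipeline the box would be rescaled from feature-map cells to image pixels and paired with the image-level label to form a pseudo detection annotation for pre-training a detector.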

Code Implementations: 1 repo
