Object Discovery via Contrastive Learning for Weakly Supervised Object Detection
This work addresses the challenge of detecting multiple object instances in images without detailed annotations, which matters for applications such as automated image analysis. The contribution is incremental, since it builds on existing self-supervised methods.
The paper tackles the problem of weakly supervised object detection, where models are trained only with image-level annotations, by proposing a novel multiple instance labeling method called object discovery and a new contrastive loss (WSCL) to improve detection of multiple object instances. The result is new state-of-the-art performance on MS-COCO 2014, MS-COCO 2017, and PASCAL VOC 2012, with competitive results on PASCAL VOC 2007.
Weakly Supervised Object Detection (WSOD) is a task that detects objects in an image using a model trained only on image-level annotations. Current state-of-the-art models benefit from self-supervised instance-level supervision, but since weak supervision does not include count or location information, the most common "argmax" labeling method often ignores many instances of objects. To alleviate this issue, we propose a novel multiple instance labeling method called object discovery. We further introduce a new contrastive loss under weak supervision where no instance-level information is available for sampling, called weakly supervised contrastive loss (WSCL). WSCL aims to construct a credible similarity threshold for object discovery by leveraging consistent features for embedding vectors in the same class. As a result, we achieve new state-of-the-art results on MS-COCO 2014 and 2017 as well as PASCAL VOC 2012, and competitive results on PASCAL VOC 2007.
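To make the difference between argmax labeling and multiple instance labeling concrete, the following is a minimal sketch and not the paper's implementation. It assumes each region proposal has a class score and an embedding vector, and that additional instances are found by comparing embeddings to the top-scoring proposal with cosine similarity. The function names, the toy data, and the fixed threshold of 0.9 are all hypothetical; the abstract only states that WSCL helps construct a credible similarity threshold.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def argmax_label(scores):
    """Argmax labeling: keep only the single top-scoring proposal."""
    return max(range(len(scores)), key=lambda i: scores[i])

def discover(scores, embeddings, threshold):
    """Hypothetical multiple instance labeling: start from the argmax
    proposal and also label every proposal whose embedding is at least
    `threshold`-similar to it (an assumed rule, for illustration only)."""
    seed = argmax_label(scores)
    return [i for i, e in enumerate(embeddings)
            if cosine(embeddings[seed], e) >= threshold]

# Toy data (assumed): four proposals for one class. Proposals 0 and 1
# have similar embeddings; proposals 2 and 3 look different.
scores = [0.9, 0.7, 0.2, 0.1]
embeddings = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

seed = argmax_label(scores)                           # single pseudo-label
found = discover(scores, embeddings, threshold=0.9)   # possibly several
print("argmax label:", seed)
print("discovered labels:", found)
```

In this toy case argmax labeling keeps only proposal 0, while the similarity rule also labels proposal 1 as a second instance. The quality of such a rule depends heavily on how consistent the embeddings are within a class, which is the role the abstract assigns to WSCL.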