CV · Nov 27, 2019

Towards Precise End-to-end Weakly Supervised Object Detection Network

arXiv:1911.12148v1 · 141 citations
Originality: Incremental advance
AI Analysis

This work, aimed at computer vision researchers, addresses inaccurate object localization in weakly supervised object detection and presents an incremental improvement over existing two-phase methods.

The paper tackles the challenge of precisely predicting object positions in weakly supervised object detection by jointly training multiple instance learning and bounding-box regression in an end-to-end network, achieving state-of-the-art performance on public datasets.

It is challenging for a weakly supervised object detection network to precisely predict the positions of objects, since there are no instance-level category annotations. Most existing methods tend to solve this problem with a two-phase learning procedure, i.e., a multiple instance learning detector followed by a fully supervised learning detector with bounding-box regression. Based on our observation, this procedure may lead to local minima for some object categories. In this paper, we propose to jointly train the two phases in an end-to-end manner to tackle this problem. Specifically, we design a single network with both multiple instance learning and bounding-box regression branches that share the same backbone. Meanwhile, a guided attention module using classification loss is added to the backbone to effectively extract the implicit location information in the features. Experimental results on public datasets show that our method achieves state-of-the-art performance.
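The joint training idea in the abstract can be illustrated with a small, dependency-free sketch. It is not the authors' implementation. It assumes a two-stream MIL head (softmax over classes multiplied by softmax over proposals, summed into image-level scores), a smooth L1 regression loss, and a simplified pseudo-supervision rule in which only the top-scoring proposal of each labelled class is regressed. The abstract specifies none of these details, and every function and parameter name here (`mil_scores`, `joint_loss`, `reg_weight`, `reg_targets`) is hypothetical. The guided attention module is left out.

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def mil_scores(cls_logits, det_logits):
    # Inputs are [num_proposals][num_classes] lists of logits.
    # Assumed two-stream MIL head: class softmax times proposal softmax.
    num_props, num_cls = len(cls_logits), len(cls_logits[0])
    cls_prob = [softmax(row) for row in cls_logits]            # over classes
    det_prob = [softmax([det_logits[r][c] for r in range(num_props)])
                for c in range(num_cls)]                       # over proposals
    prop = [[cls_prob[r][c] * det_prob[c][r] for c in range(num_cls)]
            for r in range(num_props)]
    # Image-level score per class: sum of proposal scores, always in [0, 1].
    image = [sum(prop[r][c] for r in range(num_props)) for c in range(num_cls)]
    return prop, image

def binary_cross_entropy(scores, labels, eps=1e-6):
    total = 0.0
    for s, y in zip(scores, labels):
        s = min(max(s, eps), 1.0 - eps)  # clamp to avoid log(0)
        total -= y * math.log(s) + (1 - y) * math.log(1.0 - s)
    return total / len(scores)

def smooth_l1(pred, target, beta=1.0):
    total = 0.0
    for p, t in zip(pred, target):
        d = abs(p - t)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
    return total

def joint_loss(cls_logits, det_logits, image_labels,
               box_deltas, reg_targets, reg_weight=1.0):
    # One objective for both branches, so a shared backbone would receive
    # gradients from the MIL loss and the regression loss in the same step.
    prop, image = mil_scores(cls_logits, det_logits)
    loss_mil = binary_cross_entropy(image, image_labels)
    loss_reg, num_pos = 0.0, 0
    for c, y in enumerate(image_labels):
        if y == 1:
            # Simplification (assumption): supervise only the top proposal.
            top = max(range(len(prop)), key=lambda r: prop[r][c])
            loss_reg += smooth_l1(box_deltas[top], reg_targets[top])
            num_pos += 1
    if num_pos:
        loss_reg /= num_pos
    return loss_mil + reg_weight * loss_reg

# Toy example: 3 proposals, 2 classes, image labelled with class 0 only.
cls_logits = [[2.0, 0.1], [0.3, 0.2], [0.1, 1.5]]
det_logits = [[1.5, 0.0], [0.2, 0.1], [0.0, 0.4]]
box_deltas = [[0.1, 0.0, 0.2, 0.1], [0.0] * 4, [0.0] * 4]
reg_targets = [[0.0] * 4, [0.0] * 4, [0.0] * 4]
loss = joint_loss(cls_logits, det_logits, [1, 0], box_deltas, reg_targets)
```

In a two-phase pipeline the two terms would be minimised by separate detectors trained one after the other. Summing them into one objective is the part of this sketch that corresponds to the end-to-end training the abstract describes.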

Code Implementations: 1 repo
