CV · Jun 27, 2024

HUWSOD: Holistic Self-training for Unified Weakly Supervised Object Detection

arXiv:2406.19394v1
Originality: Incremental advance
AI Analysis

This work targets two problems in WSOD: unstable training and reliance on external components. Both matter to computer vision researchers and practitioners, though the contribution is incremental because it builds on existing WSOD frameworks.

The paper tackles unstable training in weakly supervised object detection (WSOD) by introducing HUWSOD, a unified network that needs neither traditional object proposals nor external modules. HUWSOD is competitive with state-of-the-art WSOD methods on the PASCAL VOC and MS COCO datasets, and its peak performance approaches that of fully supervised Faster R-CNN.

Most WSOD methods rely on traditional object proposals to generate candidate regions and are confronted with unstable training, which easily gets stuck in a poor local optimum. In this paper, we introduce a unified, high-capacity weakly supervised object detection (WSOD) network called HUWSOD, which utilizes a comprehensive self-training framework without needing external modules or additional supervision. HUWSOD innovatively incorporates a self-supervised proposal generator and an autoencoder proposal generator with a multi-rate resampling pyramid to replace traditional object proposals, enabling end-to-end WSOD training and inference. Additionally, we implement a holistic self-training scheme that refines detection scores and coordinates through step-wise entropy minimization and consistency-constraint regularization, ensuring consistent predictions across stochastic augmentations of the same image. Extensive experiments on PASCAL VOC and MS COCO demonstrate that HUWSOD competes with state-of-the-art WSOD methods, eliminating the need for offline proposals and additional data. The peak performance of HUWSOD approaches that of fully-supervised Faster R-CNN. Our findings also indicate that randomly initialized boxes, although significantly different from well-designed offline object proposals, are effective for WSOD training.
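The abstract names two self-training signals: entropy minimization on detection scores and consistency between predictions on stochastic augmentations of the same image. The sketch below shows one generic way such terms are often written. It is not the paper's formulation: the function names, the squared-error form of the consistency term, the equal weighting of the terms, and the assumption that proposals are already matched one-to-one across the two views are all illustrative choices.

```python
import numpy as np

def softmax(logits, axis=-1):
    """Numerically stable softmax over the class axis."""
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def entropy_loss(probs, eps=1e-12):
    """Mean Shannon entropy of per-proposal class distributions.

    Minimizing it pushes each proposal toward a confident prediction.
    """
    return float(-(probs * np.log(probs + eps)).sum(axis=-1).mean())

def consistency_loss(probs_a, probs_b, boxes_a, boxes_b):
    """Mean squared disagreement between two augmented views.

    Assumes (hypothetically) that row i in view A and row i in view B
    describe the same proposal, with boxes mapped to a common frame.
    """
    score_term = ((probs_a - probs_b) ** 2).mean()
    box_term = ((boxes_a - boxes_b) ** 2).mean()
    return float(score_term + box_term)

# Toy data: 5 proposals, 3 classes, boxes as (x1, y1, x2, y2) in [0, 1].
rng = np.random.default_rng(0)
logits_a = rng.normal(size=(5, 3))
logits_b = logits_a + 0.1 * rng.normal(size=(5, 3))   # second view, perturbed
boxes_a = rng.uniform(0.0, 1.0, size=(5, 4))
boxes_b = boxes_a + 0.01 * rng.normal(size=(5, 4))

probs_a, probs_b = softmax(logits_a), softmax(logits_b)

# Unweighted sum of the terms; real systems would tune the weights.
total = (
    entropy_loss(probs_a)
    + entropy_loss(probs_b)
    + consistency_loss(probs_a, probs_b, boxes_a, boxes_b)
)
print(f"combined self-training loss: {total:.4f}")
```

In this toy form the entropy term is largest for uniform class scores and near zero for one-hot scores, while the consistency term vanishes only when both views agree exactly on scores and coordinates.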

Code Implementations: 1 repo
