CV, IV | Aug 21, 2020

Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection

Carlo Biffi, Steven McDonagh, Philip Torr, Ales Leonardis, Sarah Parisot

arXiv:2008.09694v24.213 citations

Originality: Highly original

AI Analysis

This paper addresses the high cost of annotation for object detection, offering a novel end-to-end solution that reduces reliance on fully annotated datasets.

The paper tackles the cost of manual annotation for object detection by introducing an online annotation module (OAM) that learns to generate reliable many-shot annotations from weakly labeled images. Integrated with Fast(er) R-CNN, the OAM is reported to improve performance by 17% mAP and 9% AP50 on the PASCAL VOC 2007 and MS-COCO benchmarks.

Object detection has witnessed significant progress by relying on large, manually annotated datasets. Annotating such datasets is highly time-consuming and expensive, which motivates the development of weakly supervised and few-shot object detection methods. However, these methods largely underperform with respect to their strongly supervised counterparts, as weak training signals often result in partial or oversized detections. Towards solving this problem we introduce, for the first time, an online annotation module (OAM) that learns to generate a many-shot set of reliable annotations from a larger volume of weakly labelled images. Our OAM can be jointly trained with any fully supervised two-stage object detection method, providing additional training annotations on the fly. This results in a fully end-to-end strategy that only requires a low-shot set of fully annotated images. The integration of the OAM with Fast(er) R-CNN improves their performance by 17% mAP, 9% AP50 on PASCAL VOC 2007 and MS-COCO benchmarks, and significantly outperforms competing methods using mixed supervision.
