CV · IV · Aug 21, 2020

Many-shot from Low-shot: Learning to Annotate using Mixed Supervision for Object Detection

arXiv:2008.09694v2 · 13 citations
AI Analysis

This paper addresses the high cost of annotation for object detection, offering an end-to-end approach that reduces reliance on fully annotated datasets.

The paper tackles the problem of expensive manual annotation for object detection by introducing an online annotation module (OAM) that learns to generate reliable many-shot annotations from weakly labelled images. Integrated with Fast(er) R-CNN, it improves performance by 17% mAP and 9% AP50 on the PASCAL VOC 2007 and MS-COCO benchmarks.

Object detection has witnessed significant progress by relying on large, manually annotated datasets. Annotating such datasets is highly time consuming and expensive, which motivates the development of weakly supervised and few-shot object detection methods. However, these methods largely underperform with respect to their strongly supervised counterpart, as weak training signals often result in partial or oversized detections. Towards solving this problem we introduce, for the first time, an online annotation module (OAM) that learns to generate a many-shot set of reliable annotations from a larger volume of weakly labelled images. Our OAM can be jointly trained with any fully supervised two-stage object detection method, providing additional training annotations on the fly. This results in a fully end-to-end strategy that only requires a low-shot set of fully annotated images. The integration of the OAM with Fast(er) R-CNN improves their performance by 17% mAP, 9% AP50 on PASCAL VOC 2007 and MS-COCO benchmarks, and significantly outperforms competing methods using mixed supervision.
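To make the idea of mixed supervision concrete, the sketch below shows one generic way candidate detections on a weakly labelled image could be filtered into pseudo-annotations. It is not the paper's OAM, which is a learned module trained jointly with the detector. The `Detection` type, the `select_reliable_annotations` helper, and the fixed `score_threshold` are all hypothetical names and choices introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass(frozen=True)
class Detection:
    """A candidate box on a weakly labelled image (hypothetical type)."""
    box: Tuple[float, float, float, float]  # (x1, y1, x2, y2)
    label: str
    score: float


def select_reliable_annotations(
    detections: List[Detection],
    image_labels: Set[str],
    score_threshold: float = 0.8,  # assumed value, not taken from the paper
) -> List[Detection]:
    """Keep detections that agree with the image-level labels and are confident.

    This is a hand-written rule standing in for a learned annotation module.
    """
    return [
        d for d in detections
        if d.label in image_labels and d.score >= score_threshold
    ]


# Candidate detections for one image whose weak (image-level) labels are known.
candidates = [
    Detection((10, 20, 110, 220), "dog", 0.92),     # kept
    Detection((15, 25, 60, 90), "dog", 0.40),       # dropped: low confidence
    Detection((200, 40, 300, 180), "cat", 0.95),    # dropped: not in image labels
    Detection((120, 10, 180, 230), "person", 0.85), # kept
]
pseudo = select_reliable_annotations(candidates, {"dog", "person"})
print([d.label for d in pseudo])
```

In a training loop of this kind, the surviving boxes would be added to the low-shot set of fully annotated images as extra training targets; how the paper's OAM decides reliability is described in the paper itself, not here.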

Code Implementations · 1 repo