CV · Aug 21, 2017

Revisiting knowledge transfer for training object class detectors

Jasper Uijlings, Stefan Popov, Vittorio Ferrari

arXiv:1708.06128v320.673 citations

Originality: Incremental advance

AI Analysis

This work improves the training of object detectors for computer vision applications by reducing the amount of bounding-box annotation needed for new classes. The advance is incremental because it builds on existing knowledge transfer methods.

The paper tackles training object detectors for target classes from weakly supervised images, aided by source classes with bounding-box annotations. On the ILSVRC 2013 detection dataset it reports 70.3% CorLoc and 36.9% mAP on the target classes, outperforming a weakly supervised baseline (50.5% CorLoc, 25.4% mAP) and reaching 80% of the mAP of fully supervised detectors.

We propose to revisit knowledge transfer for training object detectors on target classes from weakly supervised training images, helped by a set of source classes with bounding-box annotations. We present a unified knowledge transfer framework based on training a single neural network multi-class object detector over all source classes, organized in a semantic hierarchy. This generates proposals with scores at multiple levels in the hierarchy, which we use to explore knowledge transfer over a broad range of generality, ranging from class-specific (bicycle to motorbike) to class-generic (objectness to any class). Experiments on the 200 object classes in the ILSVRC 2013 detection dataset show that our technique (1) leads to much better performance on the target classes (70.3% CorLoc, 36.9% mAP) than a weakly supervised baseline which uses manually engineered objectness [11] (50.5% CorLoc, 25.4% mAP); (2) delivers target object detectors reaching 80% of the mAP of their fully supervised counterparts; and (3) outperforms the best reported transfer learning results on this dataset (+41% CorLoc and +3% mAP over [18, 46], +16.2% mAP over [32]). Moreover, we also carry out several across-dataset knowledge transfer experiments [27, 24, 35] and find that (4) our technique outperforms the weakly supervised baseline in all dataset pairs by 1.5x-1.9x, establishing its general applicability.
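The abstract reports localization quality as CorLoc. As a rough illustration only, the sketch below computes CorLoc under its common definition: the fraction of images in which the single predicted box overlaps some ground-truth box of the class with intersection-over-union of at least 0.5. The function names, the dictionary layout, and the one-box-per-image assumption are hypothetical choices for this example and are not taken from the paper's code.

```python
from typing import Dict, List, Tuple

# Hypothetical box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def corloc(
    predicted: Dict[str, Box],
    ground_truth: Dict[str, List[Box]],
    threshold: float = 0.5,
) -> float:
    """Fraction of images whose predicted box matches any ground-truth box.

    Assumes one predicted box per image for the class being evaluated;
    images without a prediction count as misses.
    """
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for image_id, gt_boxes in ground_truth.items()
        if image_id in predicted
        and any(iou(predicted[image_id], gt) >= threshold for gt in gt_boxes)
    )
    return hits / len(ground_truth)


# Made-up example data: one correct localization out of two images.
example_predictions = {"img_a": (0.0, 0.0, 10.0, 10.0), "img_b": (0.0, 0.0, 10.0, 10.0)}
example_ground_truth = {
    "img_a": [(0.0, 0.0, 10.0, 10.0)],
    "img_b": [(20.0, 20.0, 30.0, 30.0)],
}
example_score = corloc(example_predictions, example_ground_truth)  # 0.5
```

The example data is invented for illustration. mAP, the other metric quoted above, additionally ranks all detections by score and averages precision over recall levels, so it is not covered by this sketch.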
