CV AI Apr 3, 2018

DOCK: Detecting Objects by transferring Common-sense Knowledge

Krishna Kumar Singh, Santosh Divvala, Ali Farhadi, Yong Jae Lee

arXiv:1804.01077v2 10.731 citations

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem in computer vision and offers an incremental advance by using region-level similarity and common-sense cues.

The paper tackles the problem of object detection for target categories with only image-level annotations by transferring common-sense knowledge from source categories with bounding box annotations, resulting in substantial performance improvements on the MS COCO dataset over existing baselines.

We present a scalable approach for Detecting Objects by transferring Common-sense Knowledge (DOCK) from source to target categories. In our setting, the training data for the source categories have bounding box annotations, while those for the target categories only have image-level annotations. Current state-of-the-art approaches focus on image-level visual or semantic similarity to adapt a detector trained on the source categories to the new target categories. In contrast, our key idea is to (i) use similarity not at the image-level, but rather at the region-level, and (ii) leverage richer common-sense (based on attribute, spatial, etc.) to guide the algorithm towards learning the correct detections. We acquire such common-sense cues automatically from readily-available knowledge bases without any extra human effort. On the challenging MS COCO dataset, we find that common-sense knowledge can substantially improve detection performance over existing transfer-learning baselines.
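To make the region-level idea in the abstract concrete, here is a minimal sketch, not the authors' implementation. It assumes each region proposal already has scores from detectors trained on the source categories, and that a similarity weight between each source category and the target category is available. The function names, the category names, and all numbers are hypothetical and chosen only for illustration.

```python
from typing import Dict, List


def transfer_region_scores(
    source_scores: List[Dict[str, float]],
    similarity: Dict[str, float],
) -> List[float]:
    """Score each region for a target category (hypothetical helper).

    Each region's score is the similarity-weighted average of its
    source-category detector scores. This is an illustrative assumption,
    not the formulation used in the paper.
    """
    total = sum(similarity.values())
    if total <= 0:
        raise ValueError("similarity weights must sum to a positive value")
    return [
        sum(weight * scores.get(cat, 0.0) for cat, weight in similarity.items()) / total
        for scores in source_scores
    ]


def best_region(scores: List[float]) -> int:
    """Return the index of the highest-scoring region."""
    return max(range(len(scores)), key=scores.__getitem__)


# Two region proposals in an image labelled only with the target category.
regions = [
    {"horse": 0.9, "car": 0.1},  # region 0 looks like a horse
    {"horse": 0.2, "car": 0.8},  # region 1 looks like a car
]
# Assumed similarity of the target category "zebra" to each source category.
zebra_similarity = {"horse": 0.8, "car": 0.2}

zebra_scores = transfer_region_scores(regions, zebra_similarity)
print(best_region(zebra_scores))
```

Because the similarity is applied per region rather than per image, the sketch picks out which proposal inside the image most resembles the similar source categories. An image-level score could not do this. Common-sense cues such as attribute or spatial priors would enter as further per-region terms, which this sketch omits.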
