CV, Nov 18, 2019

Don't Even Look Once: Synthesizing Features for Zero-Shot Detection

Pengkai Zhu, Hanxiao Wang, Venkatesh Saligrama

arXiv:1911.07933v3 17.1101 citations

Originality: Highly original

AI Analysis

The paper addresses a scalability problem in large-scale applications with many object classes: it enables detection of unseen objects without extensive annotated data, though it is an incremental advance in zero-shot detection methods.

The paper tackles the problem of zero-shot detection, where models must localize both seen and unseen objects, by proposing DELO, a novel algorithm that synthesizes visual features for unseen objects and augments training. It demonstrates significant improvements in test accuracy on Pascal VOC and MSCOCO datasets over vanilla and state-of-the-art zero-shot detectors.

Zero-shot detection, namely localizing both seen and unseen objects, is increasingly important for large-scale applications with a large number of object classes, since collecting sufficient annotated data with ground-truth bounding boxes is simply not scalable. While vanilla deep neural networks deliver high performance for objects available during training, performance degrades significantly on unseen objects. At a fundamental level, while vanilla detectors are capable of proposing bounding boxes that include unseen objects, they are often incapable of assigning high confidence to unseen objects, due to the inherent precision/recall tradeoffs that require rejecting background objects. We propose a novel detection algorithm, Don't Even Look Once (DELO), that synthesizes visual features for unseen objects and augments existing training algorithms to incorporate unseen object detection. Our proposed scheme is evaluated on Pascal VOC and MSCOCO, and we demonstrate significant improvements in test accuracy over vanilla and other state-of-the-art zero-shot detectors.
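
The general idea of synthesizing features for unseen classes and adding them to the training set can be illustrated with a toy sketch. This is not the paper's generator: as a stand-in, it fits a plain least-squares map from class semantic vectors to mean visual features of seen classes, then samples noisy features around the predicted means for unseen classes. The function name `synthesize_unseen_features`, its parameters, and the random toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np


def synthesize_unseen_features(seen_feats, seen_labels, class_semantics,
                               unseen_ids, n_per_class=50, noise_std=0.1,
                               seed=0):
    """Toy feature synthesis (illustrative assumption, not DELO itself).

    seen_feats:      (N, d_vis) visual features of seen-class objects
    seen_labels:     (N,) integer class ids indexing class_semantics
    class_semantics: (C, d_sem) one semantic vector per class
    unseen_ids:      class ids with no training features
    """
    rng = np.random.default_rng(seed)
    seen_ids = np.unique(seen_labels)

    # Mean visual feature of each seen class.
    means = np.stack([seen_feats[seen_labels == c].mean(axis=0)
                      for c in seen_ids])

    # Least-squares map from semantic space to visual feature space.
    W, *_ = np.linalg.lstsq(class_semantics[seen_ids], means, rcond=None)

    feats, labels = [], []
    for c in unseen_ids:
        center = class_semantics[c] @ W  # predicted mean for unseen class c
        noise = noise_std * rng.standard_normal((n_per_class, center.shape[0]))
        feats.append(center + noise)
        labels.append(np.full(n_per_class, c))
    return np.concatenate(feats), np.concatenate(labels)


# Toy data: 3 seen classes, 2 unseen classes, random features and semantics.
rng = np.random.default_rng(1)
semantics = rng.standard_normal((5, 4))
seen_labels = np.repeat(np.arange(3), 20)
seen_feats = rng.standard_normal((60, 8))

syn_feats, syn_labels = synthesize_unseen_features(
    seen_feats, seen_labels, semantics, unseen_ids=[3, 4], n_per_class=10)

# Augmented training set: real seen features plus synthetic unseen features.
train_feats = np.concatenate([seen_feats, syn_feats])
train_labels = np.concatenate([seen_labels, syn_labels])
```

A classifier or confidence predictor trained on `train_feats` then sees examples for every class, including those with no annotated boxes, which is the role the abstract assigns to the synthesized features.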
