CV, LG · Nov 9, 2020

A Broad Dataset is All You Need for One-Shot Object Detection

arXiv:2011.04267v2 · 3 citations
AI Analysis

This paper addresses poor generalization to novel categories in few-shot object detection, offering computer vision researchers a simple scaling approach in place of more complex methods.

The paper tackles the generalization gap in one-shot object detection by showing that increasing the number of object categories used during training nearly closes it: generalization from seen to unseen classes improves from 45% to 89%, and the state of the art on COCO rises by 5.4% AP50 (from 22.0 to 27.5).

Is it possible to detect arbitrary objects from a single example? A central problem of all existing attempts at one-shot object detection is the generalization gap: Object categories used during training are detected much more reliably than novel ones. We here show that this generalization gap can be nearly closed by increasing the number of object categories used during training. Doing so allows us to improve generalization from seen to unseen classes from 45% to 89% and improve the state-of-the-art on COCO by 5.4 %AP50 (from 22.0 to 27.5). We verify that the effect is caused by the number of categories and not the number of training samples, and that it holds for different models, backbones and datasets. This result suggests that the key to strong few-shot detection models may not lie in sophisticated metric learning approaches, but instead simply in scaling the number of categories. We hope that our findings will help to better understand the challenges of few-shot learning and encourage future data annotation efforts to focus on wider datasets with a broader set of categories rather than gathering more samples per category.

