CV · AI · LG · Oct 21, 2020

UFO$^2$: A Unified Framework towards Omni-supervised Object Detection

arXiv:2010.10804v1 · 44 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of costly and varied annotations in real-world object detection, offering a flexible solution for budget-aware training.

The paper tackles object detection with diverse annotation forms by introducing UFO$^2$, a unified framework that handles strong supervision, partial supervision, and unlabeled data simultaneously. It achieves competitive performance without requiring bounding box annotations for all data, and it detects more than 1,000 different objects without bounding box annotations.

Existing work on object detection often relies on a single form of annotation: the model is trained using either accurate yet costly bounding boxes or cheaper but less expressive image-level tags. However, real-world annotations are often diverse in form, which poses a challenge for these single-annotation methods. In this paper, we present UFO$^2$, a unified object detection framework that can handle different forms of supervision simultaneously. Specifically, UFO$^2$ incorporates strong supervision (e.g., boxes), various forms of partial supervision (e.g., class tags, points, and scribbles), and unlabeled data. Through rigorous evaluations, we demonstrate that each form of label can be used either to train a model from scratch or to further improve a pre-trained model. We also use UFO$^2$ to investigate budget-aware omni-supervised learning, i.e., various annotation policies are studied under a fixed annotation budget: we show that competitive performance does not require strong labels for all data. Finally, we demonstrate that UFO$^2$ generalizes, detecting more than 1,000 different objects without bounding box annotations.
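To make the idea of a fixed annotation budget concrete, the following is a minimal sketch of the arithmetic behind comparing annotation policies. It is not the paper's method: the per-image costs, the `COST_PER_IMAGE` table, and the `images_affordable` helper are all hypothetical and chosen only for illustration.

```python
# Hypothetical per-image annotation costs in arbitrary time units.
# These values are illustrative assumptions, not figures from the paper.
COST_PER_IMAGE = {
    "box": 10.0,      # strong supervision
    "scribble": 4.0,  # partial supervision
    "point": 2.0,     # partial supervision
    "tag": 1.0,       # partial supervision (image-level class tag)
}


def images_affordable(budget, policy):
    """Return how many images each label type receives under a fixed budget.

    `policy` maps a label type to the fraction of the budget spent on it.
    The fractions must sum to 1.
    """
    if abs(sum(policy.values()) - 1.0) > 1e-9:
        raise ValueError("policy fractions must sum to 1")
    counts = {}
    for label, fraction in policy.items():
        cost = COST_PER_IMAGE[label]
        # Whole images only: round down to what the share of budget covers.
        counts[label] = int((budget * fraction) // cost)
    return counts


if __name__ == "__main__":
    budget = 1000.0
    all_boxes = images_affordable(budget, {"box": 1.0})
    mixed = images_affordable(budget, {"box": 0.5, "tag": 0.5})
    print("all boxes:", all_boxes)
    print("half boxes, half tags:", mixed)
```

Under these assumed costs, a mixed policy labels far more images than an all-boxes policy for the same spend, which is the kind of trade-off a budget-aware study compares. Which policy yields the better detector is an empirical question that this sketch does not answer.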

