Extending One-Stage Detection with Open-World Proposals
This work addresses the challenge of detecting unseen objects in applications like autonomous driving, offering an incremental improvement by adapting one-stage detection for open-world scenarios.
The paper tackles open-world detection, where object detection must generalize to unseen classes, by extending a one-stage detection network (FCOS) with architectural and sampling optimizations to generate open-world proposals. The result is an increase of up to 6% in recall on novel classes, bringing performance in line with two-stage networks. When jointly optimized for proposals and classification, the one-stage model also degrades less: recall on novel classes drops by only 2%, compared to 6% for two-stage methods.
In many applications, such as autonomous driving, hand manipulation, or robot navigation, object detection methods must be able to detect objects unseen in the training set. Open World Detection (OWD) seeks to tackle this problem by generalizing detection performance to both seen and unseen class categories. Recent works have succeeded in generating class-agnostic proposals, which we call Open-World Proposals (OWP), but at the cost of a large drop in classification performance when both tasks are handled by the same detection model. These works have investigated two-stage Region Proposal Networks (RPN), taking advantage of objectness scoring cues. Because of its simplicity, run-time, and decoupling of localization and classification, we instead investigate OWP through the lens of a fully convolutional one-stage detection network, such as FCOS. We show that our architectural and sampling optimizations on FCOS can increase OWP performance by as much as 6% in recall on novel classes, making it the first proposal-free one-stage detection network to achieve performance comparable to RPN-based two-stage networks. Furthermore, we show that the inherently decoupled architecture of FCOS helps retain classification performance. While two-stage methods worsen by 6% in recall on novel classes, FCOS drops only 2% when jointly optimizing for OWP and classification.
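To make the headline metric concrete, the following is a minimal sketch of how recall on novel classes could be computed for class-agnostic proposals. It is an illustration under stated assumptions, not the evaluation protocol used in this work: the function names `iou` and `novel_recall`, the IoU threshold of 0.5, the proposal budget `top_k`, and the convention that proposals arrive sorted by descending score are all hypothetical choices made for this example.

```python
from typing import Sequence, Set, Tuple

# Axis-aligned box as (x1, y1, x2, y2); an assumed convention for this sketch.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def novel_recall(
    proposals: Sequence[Box],
    gt_boxes: Sequence[Box],
    gt_labels: Sequence[str],
    seen_classes: Set[str],
    iou_thresh: float = 0.5,  # assumed threshold, not taken from the paper
    top_k: int = 100,         # assumed proposal budget, not taken from the paper
) -> float:
    """Fraction of novel-class ground-truth boxes matched by any proposal.

    Proposals are class-agnostic and assumed to be sorted by descending score,
    so only the first `top_k` are considered. Ground-truth boxes whose label
    is in `seen_classes` are ignored.
    """
    novel = [
        box for box, label in zip(gt_boxes, gt_labels)
        if label not in seen_classes
    ]
    if not novel:
        return float("nan")  # recall is undefined without novel objects
    kept = proposals[:top_k]
    hits = sum(
        1 for gt in novel
        if any(iou(p, gt) >= iou_thresh for p in kept)
    )
    return hits / len(novel)
```

Under these assumptions, an image with one seen object and two novel objects, where the proposals cover only one of the novel objects, yields a novel-class recall of 0.5 regardless of how well the seen object is localized.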