Incrementally Zero-Shot Detection by an Extreme Value Analyzer
This work addresses a practical problem for real-world object detection systems, which need to handle both novel unseen classes and incremental updates to existing knowledge.
The paper tackles the combined challenge of zero-shot and incremental learning in object detection by introducing the Incrementally Zero-Shot Detection (IZSD) task and proposing the IZSD-EVer model, which outperforms alternative models on the Pascal VOC and MSCOCO datasets.
Human beings can not only recognize novel unseen classes but also incrementally incorporate new classes into their existing knowledge. However, zero-shot learning models assume that all seen classes are known beforehand, while incremental learning models cannot recognize unseen classes. This paper introduces a novel and challenging task, Incrementally Zero-Shot Detection (IZSD), a practical setting for both zero-shot learning and class-incremental learning in real-world object detection. We propose an end-to-end model, IZSD-EVer, to tackle this task, which requires incrementally detecting new classes while also detecting classes that have never been seen. Specifically, we propose a novel extreme value analyzer to detect objects from old seen, new seen, and unseen classes simultaneously. We also propose two losses: a background-foreground mean squared error loss, which alleviates the extreme imbalance between the background and foreground of images, and a projection distance loss, which aligns the visual and semantic spaces of old seen classes. Experiments demonstrate the efficacy of our model in detecting objects from both seen and unseen classes, outperforming the alternative models on the Pascal VOC and MSCOCO datasets.
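The abstract does not give the formula for the background-foreground mean squared error loss, so the following is only a minimal sketch of one common way to counter background-foreground imbalance: average the squared error separately over foreground and background samples, then add the two means. The function name `balanced_mse`, its arguments, and the equal weighting of the two terms are assumptions made for illustration, not the paper's definition.

```python
from typing import Sequence


def balanced_mse(pred: Sequence[float],
                 target: Sequence[float],
                 is_foreground: Sequence[bool]) -> float:
    """Hypothetical balanced MSE: one mean per group, then summed.

    This is an illustrative stand-in, not the loss defined in the paper.
    """
    fg_errors = []
    bg_errors = []
    for p, t, fg in zip(pred, target, is_foreground):
        err = (p - t) ** 2
        # Route each squared error to its group.
        (fg_errors if fg else bg_errors).append(err)

    # Averaging within each group keeps a large number of background
    # samples from dominating the few foreground samples.
    fg_term = sum(fg_errors) / len(fg_errors) if fg_errors else 0.0
    bg_term = sum(bg_errors) / len(bg_errors) if bg_errors else 0.0
    return fg_term + bg_term


if __name__ == "__main__":
    # One badly predicted foreground sample among four perfect background ones.
    pred = [1.0, 0.0, 0.0, 0.0, 0.0]
    target = [0.0, 0.0, 0.0, 0.0, 0.0]
    mask = [True, False, False, False, False]
    print(balanced_mse(pred, target, mask))
```

In this toy input a plain mean over all five samples would dilute the single foreground error by a factor of five, whereas the per-group average keeps it at full weight. That is the general intuition behind treating background and foreground separately.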