CV · May 26

COVD: Continual Open-Vocabulary Object Detection with Novel Concept Injection

arXiv:2605.27116 · 60.4
Predicted impact: top 57% in CV · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses the practical problem of evolving category spaces in open-vocabulary detection, offering an efficient solution for continual learning without full retraining.

The paper introduces COVD, a new task for continual open-vocabulary object detection in which models sequentially learn novel concepts without forgetting prior knowledge. The proposed NoIn-Det method freezes the visual encoder and updates only a small subset of text-branch parameters; it adds no parameters and outperforms existing continual learning methods for VLMs.

Open-vocabulary object detection (OVD) has made significant progress, enabling detectors to generalize from seen to unseen categories. However, real-world category spaces continually evolve, and existing OVD models still struggle with newly emerging concepts, while repeated full retraining is prohibitively expensive. To this end, we introduce a new task setting, termed Continual OVD with Novel Concept Injection (COVD), where models sequentially learn incoming novel concept groups while preserving prior concepts and original open-vocabulary knowledge, along with a new benchmark, Novel-114. Our key observation is that pretrained visual encoders often already perceive and represent many novel concepts, and the main bottleneck lies in the lack of stable semantic alignment between visual representations and textual concepts. Based on this, we propose NoIn-Det, an efficient continual injection framework without additional parameters. NoIn-Det freezes the visual encoder, preserves the text representation space using only texts of common concepts and previously injected concepts, and injects novel concepts by updating only a small subset of text-branch parameters beneficial to novel concept learning. Extensive experiments show that NoIn-Det effectively learns novel concepts, preserves old knowledge, and consistently outperforms existing continual learning methods for VLMs without introducing additional parameters. Novel-114 and the code will be released.
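
The idea of freezing the visual encoder and updating only a small subset of text-branch parameters can be illustrated with a toy sketch. This is not the authors' implementation: the abstract does not say how the "beneficial" subset is chosen, so ranking entries by gradient magnitude is an assumption made here, and the names `select_top_fraction`, `masked_sgd_step`, `visual.proj`, and `text.proj` are hypothetical.

```python
from typing import Dict, List

Params = Dict[str, List[float]]
Mask = Dict[str, List[bool]]


def select_top_fraction(grads: Params, fraction: float) -> Mask:
    """Mark roughly the `fraction` of entries with the largest |gradient|.

    ASSUMPTION: gradient magnitude stands in for the paper's (unspecified)
    criterion for parameters "beneficial to novel concept learning".
    Ties at the threshold may select slightly more than `fraction`.
    """
    magnitudes = sorted((abs(g) for gs in grads.values() for g in gs), reverse=True)
    k = max(1, int(len(magnitudes) * fraction))
    threshold = magnitudes[k - 1]
    return {name: [abs(g) >= threshold for g in gs] for name, gs in grads.items()}


def masked_sgd_step(params: Params, grads: Params, mask: Mask, lr: float) -> Params:
    """Return updated copies; entries outside the mask keep their old value."""
    return {
        name: [
            p - lr * g if m else p
            for p, g, m in zip(params[name], grads[name], mask[name])
        ]
        for name in params
    }


# Toy parameters (hypothetical names and values).
visual_params: Params = {"visual.proj": [0.5, -0.2, 0.1]}    # frozen: never updated
text_params: Params = {"text.proj": [1.0, 2.0, 3.0, 4.0]}
text_grads: Params = {"text.proj": [0.01, -0.9, 0.02, 0.5]}  # made-up gradients

mask = select_top_fraction(text_grads, fraction=0.5)
new_text_params = masked_sgd_step(text_params, text_grads, mask, lr=0.1)
```

Because no new tensors are created and `visual_params` is never passed to the update step, the sketch keeps the parameter count fixed and leaves the visual branch untouched, mirroring the two constraints the abstract states.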
