CV | Mar 15, 2024

CoLeCLIP: Open-Domain Continual Learning via Joint Task Prompt and Vocabulary Learning

arXiv:2403.10245v1 | 17 citations | h-index: 10 | IEEE Trans Neural Netw Learn Syst
Originality: Highly original
AI Analysis

This paper addresses catastrophic forgetting in open-domain continual learning, which matters for applications such as AI assistants and autonomous systems, and extends continual learning beyond closed-set scenarios.

The paper tackles continual learning for vision-language models in open domains with diverse datasets and novel classes. It introduces CoLeCLIP, which jointly learns task prompts and a cross-domain class vocabulary and outperforms state-of-the-art methods on 11 domain datasets.

This paper explores the problem of continual learning (CL) of vision-language models (VLMs) in open domains, where the models need to perform continual updating and inference on a stream of datasets from diverse seen and unseen domains with novel classes. Such a capability is crucial for various applications in open environments, e.g., AI assistants, autonomous driving systems, and robotics. Current CL studies mostly focus on closed-set scenarios in a single domain with known classes. Large pre-trained VLMs like CLIP have demonstrated superior zero-shot recognition ability, and a number of recent studies leverage this ability to mitigate catastrophic forgetting in CL, but they focus on closed-set CL on a single-domain dataset. Open-domain CL of large VLMs is significantly more challenging due to 1) large class correlations and domain gaps across the datasets and 2) the forgetting of zero-shot knowledge in the pre-trained VLMs in addition to the knowledge learned from the newly adapted datasets. In this work we introduce a novel approach, termed CoLeCLIP, that learns an open-domain CL model based on CLIP. It addresses these challenges by jointly learning a set of task prompts and a cross-domain class vocabulary. Extensive experiments on 11 domain datasets show that CoLeCLIP outperforms state-of-the-art methods for open-domain CL under both task- and class-incremental learning settings.
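
The difference between the task- and class-incremental settings mentioned in the abstract can be illustrated with a toy sketch. This is not CoLeCLIP's implementation: the three-dimensional vectors are hand-made stand-ins for CLIP text and image features, and the names `vocabulary`, `tasks`, `predict_task_incremental`, and `predict_class_incremental` are hypothetical. It only shows the standard distinction: in task-incremental inference the task identity is given, so candidates are limited to that task's classes, whereas in class-incremental inference the model must choose among all classes seen so far.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical toy embeddings standing in for text features of class names.
# A single dictionary plays the role of a vocabulary shared across domains.
vocabulary = {
    "cat": [1.0, 0.0, 0.0],
    "dog": [0.8, 0.6, 0.0],
    "sedan": [0.0, 1.0, 0.0],
    "truck": [0.0, 0.8, 0.6],
}

# Hypothetical tasks (domains), each contributing its own classes.
tasks = {"pets": ["cat", "dog"], "vehicles": ["sedan", "truck"]}

def predict(image_feat, candidates):
    """Return the candidate class whose embedding is most similar to the image."""
    return max(candidates, key=lambda name: cosine(image_feat, vocabulary[name]))

def predict_task_incremental(image_feat, task_id):
    # Task identity is known at inference: search only that task's classes.
    return predict(image_feat, tasks[task_id])

def predict_class_incremental(image_feat):
    # Task identity is unknown: search every class seen so far.
    return predict(image_feat, list(vocabulary))

# A made-up image feature that lies between the "dog" and "sedan" directions.
image_feat = [0.5, 0.85, 0.1]

print(predict_task_incremental(image_feat, "vehicles"))
print(predict_class_incremental(image_feat))
```

With these made-up vectors the two settings disagree on the same image, which is the kind of cross-domain class confusion that makes the class-incremental setting harder.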

Code Implementations: 1 repo
