Continual Learning in Open-vocabulary Classification with Complementary Memory Systems
This work addresses the challenge of flexible and efficient continual learning for image classification, but the contribution is incremental, as it builds on existing CLIP and exemplar-based methods.
The paper tackles the problem of continual learning in open-vocabulary image classification by combining a CLIP zero-shot model with an exemplar-based model, achieving a balance of learning speed, target task effectiveness, and zero-shot effectiveness across incremental settings.
We introduce a method for flexible and efficient continual learning in open-vocabulary image classification, drawing inspiration from the complementary learning systems observed in human cognition. Specifically, we propose to combine predictions from a CLIP zero-shot model and an exemplar-based model, using the zero-shot estimated probability that a sample's class is within the exemplar classes. We also propose a "tree probe" method, an adaptation of lazy learning principles, which enables fast learning from new examples with accuracy competitive with batch-trained linear models. We test in data incremental, class incremental, and task incremental settings, and also evaluate the ability to perform flexible inference on varying subsets of zero-shot and learned categories. Our proposed method achieves a good balance of learning speed, target task effectiveness, and zero-shot effectiveness. Code will be available at https://github.com/jessemelpolio/TreeProbe.
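The prediction-combining step described in the abstract can be illustrated with a minimal sketch. This is one plausible reading, not the authors' implementation: the function name `fuse_predictions`, its arguments, and the exact weighting rule are assumptions made for illustration. Here the zero-shot probability mass on the exemplar classes is treated as the estimated probability that the sample belongs to an exemplar class, and that mass is redistributed according to the exemplar-based model; labels outside the exemplar set keep their zero-shot probabilities.

```python
import numpy as np

def fuse_predictions(p_zero_shot, p_exemplar, exemplar_idx):
    """Hypothetical fusion rule (an assumption, not the paper's exact formula).

    p_zero_shot:  zero-shot probabilities over all candidate labels (sums to 1)
    p_exemplar:   exemplar-model probabilities over exemplar classes (sums to 1)
    exemplar_idx: positions of the exemplar classes within the candidate labels
    """
    p_zero_shot = np.asarray(p_zero_shot, dtype=float)
    p_exemplar = np.asarray(p_exemplar, dtype=float)
    exemplar_idx = np.asarray(exemplar_idx, dtype=int)

    # Zero-shot estimate that the true class is one of the exemplar classes.
    w = p_zero_shot[exemplar_idx].sum()

    # Non-exemplar labels keep their zero-shot probability; the mass w is
    # redistributed over exemplar labels according to the exemplar model.
    fused = p_zero_shot.copy()
    fused[exemplar_idx] = w * p_exemplar
    return fused, w

# Toy example: four candidate labels, the first two have stored exemplars.
p_zs = [0.5, 0.2, 0.2, 0.1]
p_ex = [0.9, 0.1]
fused, w = fuse_predictions(p_zs, p_ex, [0, 1])
print(fused, w)
```

Under this assumed rule the fused vector remains a valid probability distribution, since the exemplar labels share exactly the mass the zero-shot model assigned to them. When the zero-shot model places little mass on the exemplar classes, the output stays close to the zero-shot prediction, which is one way zero-shot effectiveness could be preserved.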