Contrastive Supervised Distillation for Continual Representation Learning
This paper addresses the problem of forgetting in sequential learning for visual retrieval, though the contribution appears incremental because it builds on existing distillation and contrastive learning approaches.
The paper tackles catastrophic forgetting in continual representation learning for visual search tasks by proposing Contrastive Supervised Distillation (CSD), which reduces feature forgetting and outperforms state-of-the-art methods.
In this paper, we propose a novel training procedure for continual representation learning, in which a neural network is trained sequentially, with the goal of alleviating catastrophic forgetting in visual search tasks. Our method, called Contrastive Supervised Distillation (CSD), reduces feature forgetting while learning discriminative features. This is achieved by leveraging label information in a distillation setting in which the student model is contrastively learned from the teacher model. Extensive experiments show that CSD mitigates catastrophic forgetting more effectively than current state-of-the-art methods. Our results also provide further evidence that feature forgetting evaluated in visual retrieval tasks is not as catastrophic as in classification tasks. Code at: https://github.com/NiccoBiondi/ContrastiveSupervisedDistillation.
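To make the idea of label-aware contrastive distillation concrete, the following is a minimal sketch and not the authors' implementation (see the linked repository for that). It assumes a supervised-contrastive-style loss in which each student feature is the anchor, teacher features with the same label are positives, and teacher features with other labels are negatives. The function names, the temperature value, and the toy data are all illustrative assumptions.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes a non-zero vector).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def contrastive_distillation_loss(student, teacher, labels, temperature=0.1):
    """Hypothetical label-aware contrastive distillation loss.

    student, teacher: lists of feature vectors for the same batch.
    labels: class label of each sample.
    The temperature of 0.1 is an assumed value, not taken from the paper.
    """
    s = [l2_normalize(v) for v in student]
    t = [l2_normalize(v) for v in teacher]
    n = len(labels)
    total = 0.0
    for i in range(n):
        # Similarity of student anchor i to every teacher feature.
        logits = [dot(s[i], t[j]) / temperature for j in range(n)]
        # Numerically stable log-sum-exp over all teacher features.
        m = max(logits)
        log_denom = m + math.log(sum(math.exp(z - m) for z in logits))
        # Positives: teacher features sharing the anchor's label.
        # This set always contains j == i, so it is never empty.
        positives = [j for j in range(n) if labels[j] == labels[i]]
        total += -sum(logits[j] - log_denom for j in positives) / len(positives)
    return total / n


# Toy batch: two classes, two samples each (made-up 2-D features).
teacher_feats = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
batch_labels = [0, 0, 1, 1]

# A student that matches the teacher vs. one whose classes are swapped.
aligned_student = [list(v) for v in teacher_feats]
swapped_student = [[0.0, 1.0], [0.1, 0.9], [1.0, 0.0], [0.9, 0.1]]

aligned_loss = contrastive_distillation_loss(aligned_student, teacher_feats, batch_labels)
swapped_loss = contrastive_distillation_loss(swapped_student, teacher_feats, batch_labels)
```

In this toy setup the aligned student incurs a lower loss than the swapped one, since its features sit close to same-class teacher features. In a real continual learning setting the teacher would presumably be the frozen model from the previous task and the loss would be computed on learned embeddings with an autodiff framework.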