CV · AI · Mar 11, 2024

CEAT: Continual Expansion and Absorption Transformer for Non-Exemplar Class-Incremental Learning

Xinyuan Gao, Songlin Dong, Yuhang He, Xing Wei, Yihong Gong

arXiv:2403.06670v28.711 citations · h-index: 14 · IEEE Transactions on Circuits and Systems for Video Technology (Print)

Originality: Highly original

AI Analysis

This work addresses the challenge of continual learning in privacy-sensitive scenarios, offering a novel architecture that mitigates forgetting and classifier bias without storing old data.

The paper tackles the problem of non-exemplar class-incremental learning, where models must learn new tasks without storing old data due to privacy constraints, by proposing CEAT, which achieves improvements of 5.38%, 5.20%, and 4.92% on CIFAR-100, TinyImageNet, and ImageNet-Subset benchmarks.

In real-world applications, dynamic scenarios require models to learn new tasks continuously without forgetting old knowledge. Experience-replay methods store a subset of old images for joint training. Under stricter privacy protection, storing old images becomes infeasible, which leads to a more severe plasticity-stability dilemma and classifier bias. To meet these challenges, we propose a new architecture, the continual expansion and absorption transformer (CEAT). The model learns novel knowledge by adding expanded-fusion layers in parallel with the frozen previous parameters. After each task ends, we losslessly absorb the extended parameters into the backbone so that the number of parameters remains constant. To improve the learning ability of the model, we design a novel prototype contrastive loss that reduces the overlap between old and new classes in the feature space. In addition, to address the classifier bias towards new classes, we propose a novel approach that generates pseudo-features to correct the classifier. We evaluate our method on three standard Non-Exemplar Class-Incremental Learning (NECIL) benchmarks. Extensive experiments demonstrate that our model significantly outperforms previous works, achieving improvements of 5.38%, 5.20%, and 4.92% on CIFAR-100, TinyImageNet, and ImageNet-Subset, respectively.
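The abstract does not spell out how the extended parameters are absorbed, so the following is only a minimal sketch of why a parallel branch can be merged without changing the output. It assumes, purely for illustration, that both the frozen layer and the expanded layer are plain linear maps applied to the same input; the names `W_frozen`, `W_new`, `parallel_forward`, and `absorb` are hypothetical and do not come from the paper.

```python
import random

def matvec(W, x):
    """Multiply a matrix (given as a list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def parallel_forward(W_frozen, W_new, x):
    """Forward pass during a task: frozen branch plus a parallel trainable branch.
    Assumption: both branches are linear and see the same input."""
    return [a + b for a, b in zip(matvec(W_frozen, x), matvec(W_new, x))]

def absorb(W_frozen, W_new):
    """Merge the parallel branch into the frozen weights by element-wise addition."""
    return [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(W_frozen, W_new)]

random.seed(0)
d_in, d_out = 4, 3
W_frozen = [[random.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
W_new = [[random.uniform(-1, 1) for _ in range(d_in)] for _ in range(d_out)]
x = [random.uniform(-1, 1) for _ in range(d_in)]

before = parallel_forward(W_frozen, W_new, x)  # output with two branches
W_merged = absorb(W_frozen, W_new)             # one matrix, same shape as W_frozen
after = matvec(W_merged, x)                    # output after absorption

# Largest deviation between the two outputs (only floating-point rounding remains).
max_diff = max(abs(a - b) for a, b in zip(before, after))
# The merged matrix has the same shape as the frozen one, so the parameter count is unchanged.
same_shape = (len(W_merged), len(W_merged[0])) == (len(W_frozen), len(W_frozen[0]))
```

Under this linearity assumption, W_frozen x + W_new x = (W_frozen + W_new) x, which is why the merge leaves the output unchanged and the parameter count constant. The actual expanded-fusion layers in CEAT may be structured differently.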
