LG ML | Nov 26, 2024

Integrating Dual Prototypes for Task-Wise Adaption in Pre-Trained Model-Based Class-Incremental Learning

Zhiming Xu, Suorong Yang, Baile Xu, Furao Shen, Jian Zhao

arXiv:2411.17766v34.61 citations | h-index: 19 | Has Code | Neural Networks

Originality: Incremental advance

AI Analysis

This paper addresses knowledge retention in incremental learning for AI systems, but the advance is incremental because it builds on existing pre-trained model-based methods.

The paper tackles catastrophic forgetting in class-incremental learning by proposing DPTA, a method that uses dual prototypes and task-wise adapters to fine-tune pre-trained models, achieving excellent performance on benchmark datasets.

Class-incremental learning (CIL) aims to acquire new classes incrementally while preserving historical knowledge. Although existing pre-trained model (PTM) based methods perform excellently in CIL, it is better to fine-tune them on downstream incremental tasks that contain many patterns unknown to the PTM. However, fine-tuning on task streams can lead to catastrophic forgetting that erases the knowledge in PTMs. This paper proposes the Dual Prototype network for Task-wise Adaption (DPTA) of PTM-based CIL. For each incremental learning task, an adapter module is built to fine-tune the PTM, where the center-adapt loss forces the representation to be more centrally clustered and class separable. The dual prototype network improves the prediction process by enabling test-time adapter selection: the raw prototypes deduce several possible task indexes of a test sample to select suitable adapter modules for the PTM, and the augmented prototypes, which can separate highly correlated classes, are used to determine the final result. Experiments on several benchmark datasets demonstrate the excellent performance of DPTA. Code is available at https://github.com/Yorkxzm/DPTA
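
The two-stage prediction described in the abstract can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes that prototypes are class-mean feature vectors, that similarity is cosine similarity, and that each adapter can be modeled as a plain function on a raw feature vector. The names `class_means`, `predict`, `class_to_task`, and `top_k` are hypothetical and chosen only for illustration.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (assumed metric)."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def class_means(features, labels):
    """Build one prototype per class as the mean of its feature vectors."""
    sums, counts = {}, {}
    for f, y in zip(features, labels):
        if y not in sums:
            sums[y] = [0.0] * len(f)
            counts[y] = 0
        sums[y] = [s + x for s, x in zip(sums[y], f)]
        counts[y] += 1
    return {y: [s / counts[y] for s in sums[y]] for y in sums}


def predict(x, raw_protos, aug_protos, class_to_task, adapters, top_k=2):
    """Two-stage prediction sketch.

    x             -- raw feature of the test sample (frozen PTM output)
    raw_protos    -- {class: prototype} computed from raw PTM features
    aug_protos    -- {class: prototype} computed from adapted features
    class_to_task -- {class: task index}
    adapters      -- {task index: function mapping a raw feature to an
                      adapted feature}; a stand-in for adapter modules
    """
    # Stage 1: the top-k nearest raw prototypes vote for candidate tasks.
    ranked = sorted(raw_protos,
                    key=lambda c: cosine(x, raw_protos[c]),
                    reverse=True)
    candidate_tasks = {class_to_task[c] for c in ranked[:top_k]}

    # Stage 2: for each candidate task, adapt the feature and compare it
    # only against that task's augmented prototypes.
    best_class, best_sim = None, -2.0
    for t in sorted(candidate_tasks):
        z = adapters[t](x)
        for c, proto in aug_protos.items():
            if class_to_task[c] != t:
                continue
            sim = cosine(z, proto)
            if sim > best_sim:
                best_class, best_sim = c, sim
    return best_class
```

Comparing similarities across different adapters, as the final loop does, is a simplification made for this sketch; how candidate tasks are scored against each other in DPTA itself is not stated in the abstract.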
