BiRT: Bio-inspired Replay in Vision Transformers for Continual Learning
This work addresses the problem of catastrophic forgetting in continual learning for vision transformers, offering a memory-efficient and robust solution that is incremental in improving representation rehearsal.
The paper tackles catastrophic forgetting in continual learning for vision transformers by proposing BiRT, a bio-inspired representation rehearsal method that introduces constructive noises and enforces prediction consistency, achieving consistent performance gains over existing rehearsal approaches on challenging benchmarks.
The ability of deep neural networks to continually learn and adapt to a sequence of tasks has remained challenging due to catastrophic forgetting of previously learned tasks. Humans, on the other hand, have a remarkable ability to acquire, assimilate, and transfer knowledge across tasks throughout their lifetime without catastrophic forgetting. The versatility of the brain can be attributed to the rehearsal of abstract experiences through a complementary learning system. However, representation rehearsal in vision transformers lacks diversity, resulting in overfitting and consequently, performance drops significantly compared to raw image rehearsal. Therefore, we propose BiRT, a novel representation rehearsal-based continual learning approach using vision transformers. Specifically, we introduce constructive noises at various stages of the vision transformer and enforce consistency in predictions with respect to an exponential moving average of the working model. Our method provides consistent performance gain over raw image and vanilla representation rehearsal on several challenging CL benchmarks, while being memory efficient and robust to natural and adversarial corruptions.