CVIVApr 11, 2025

Enhancing knowledge retention for continual learning with domain-specific adapters and features gating

arXiv:2504.08613v1h-index: 5Applied intelligence (Boston)
Originality Incremental advance
AI Analysis

This addresses the problem of maintaining model performance across evolving data distributions for AI systems that learn continuously, though it appears incremental as it builds on existing parameter-efficient fine-tuning methods.

The paper tackles catastrophic forgetting in continual learning by integrating domain-specific adapters and feature gating into Vision Transformers, achieving improved knowledge retention when sequentially learning from multiple datasets like CIFAR-100, Flowers102, and DTD.

Continual learning empowers models to learn from a continuous stream of data while preserving previously acquired knowledge, effectively addressing the challenge of catastrophic forgetting. In this study, we propose a new approach that integrates adapters within the self-attention mechanisms of Vision Transformers to enhance knowledge retention when sequentially adding datasets from different domains. Unlike previous methods that continue learning with only one dataset, our approach introduces domain-specific output heads and feature gating, allowing the model to maintain high accuracy on previously learned tasks while incorporating only the essential information from multiple domains. The proposed method is compared to prominent parameter-efficient fine-tuning methods in the current state of the art. The results provide evidence that our method effectively alleviates the limitations of previous works. Furthermore, we conduct a comparative analysis using three datasets, CIFAR-100, Flowers102, and DTD, each representing a distinct domain, to investigate the impact of task order on model performance. Our findings underscore the critical role of dataset sequencing in shaping learning outcomes, demonstrating that strategic ordering can significantly improve the model's ability to adapt to evolving data distributions over time while preserving the integrity of previously learned knowledge.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes