LGCVSep 1, 2025

Forward-Only Continual Learning

arXiv:2509.01533v12 citationsh-index: 8MM
Originality Incremental advance
AI Analysis

This addresses the problem of computational inefficiency in continual learning for resource-constrained environments, offering an incremental improvement over existing methods.

The paper tackles catastrophic forgetting in continual learning with pre-trained models by proposing FoRo, a forward-only, gradient-free method that reduces average forgetting and improves accuracy while cutting memory usage and run time.

Catastrophic forgetting remains a central challenge in continual learning (CL) with pre-trained models. While existing approaches typically freeze the backbone and fine-tune a small number of parameters to mitigate forgetting, they still rely on iterative error backpropagation and gradient-based optimization, which can be computationally intensive and less suitable for resource-constrained environments. To address this, we propose FoRo, a forward-only, gradient-free continual learning method. FoRo consists of a lightweight prompt tuning strategy and a novel knowledge encoding mechanism, both designed without modifying the pre-trained model. Specifically, prompt embeddings are inserted at the input layer and optimized using the Covariance Matrix Adaptation Evolution Strategy (CMA-ES), which mitigates distribution shifts and extracts high-quality task representations. Subsequently, task-specific knowledge is encoded into a knowledge encoding matrix via nonlinear random projection and recursive least squares, enabling incremental updates to the classifier without revisiting prior data. Experiments show that FoRo significantly reduces average forgetting and improves accuracy. Thanks to forward-only learning, FoRo reduces memory usage and run time while maintaining high knowledge retention across long task sequences. These results suggest that FoRo could serve as a promising direction for exploring continual learning with pre-trained models, especially in real-world multimedia applications where both efficiency and effectiveness are critical.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes