LG · May 24, 2025

Exemplar-Free Continual Learning for State Space Models

Isaac Ning Lee, Leila Mahmoodi, Trung Le, Mehrtash Harandi

arXiv:2505.18604v1 · 4.1 · h-index: 29

Originality: Incremental advance

AI Analysis

This paper addresses the problem of adapting State-Space Models (SSMs) to sequential tasks without storing prior data, which matters for applications that need efficient continual learning (CL). The contribution appears incremental, since it builds on existing CL methods.

The paper tackles catastrophic forgetting in exemplar-free continual learning for State-Space Models. It proposes Inf-SSM, a geometry-aware regularization method that constrains state evolution using the geometry of the infinite-dimensional Grassmannian. The authors report a significant reduction in forgetting and improved accuracy on benchmarks including ImageNet-R and Caltech-256.

State-Space Models (SSMs) excel at capturing long-range dependencies with structured recurrence, making them well-suited for sequence modeling. However, their evolving internal states pose challenges in adapting them under Continual Learning (CL). This is particularly difficult in exemplar-free settings, where the absence of prior data leaves updates to the dynamic SSM states unconstrained, resulting in catastrophic forgetting. To address this, we propose Inf-SSM, a novel and simple geometry-aware regularization method that utilizes the geometry of the infinite-dimensional Grassmannian to constrain state evolution during CL. Unlike classical continual learning methods that constrain weight updates, Inf-SSM regularizes the infinite-horizon evolution of SSMs encoded in their extended observability subspace. We show that enforcing this regularization requires solving a matrix equation known as the Sylvester equation, which typically incurs $\mathcal{O}(n^3)$ complexity. We develop an $\mathcal{O}(n^2)$ solution by exploiting the structure and properties of SSMs. This leads to an efficient regularization mechanism that can be seamlessly integrated into existing CL methods. Comprehensive experiments on challenging benchmarks, including ImageNet-R and Caltech-256, demonstrate a significant reduction in forgetting while improving accuracy across sequential tasks.
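To illustrate why SSM structure can lower the cost of a Sylvester-type solve, here is a minimal sketch. It is not the paper's algorithm. It assumes two discrete-time SSMs with diagonal, stable state matrices $A_1 = \mathrm{diag}(a_1)$ and $A_2 = \mathrm{diag}(a_2)$ and output matrices $C_1, C_2$ (for instance, a previous-task and a current-task model). The cross-Gram matrix of their extended observability matrices, $G = \sum_{k \ge 0} (A_1^\top)^k C_1^\top C_2 A_2^k$, satisfies the discrete Sylvester (Stein) equation $G - A_1^\top G A_2 = C_1^\top C_2$. For diagonal state matrices this decouples entrywise, so the solve step needs only $\mathcal{O}(n^2)$ operations. The function names and the random test data are hypothetical.

```python
import numpy as np

def cross_gram_diag(a1, C1, a2, C2):
    """Solve G - diag(a1) G diag(a2) = C1^T C2 entrywise.

    Assumes diagonal state matrices with |a1_i * a2_j| < 1, so that
    G_ij = (C1^T C2)_ij / (1 - a1_i * a2_j). The division is O(n^2).
    """
    Q = C1.T @ C2
    return Q / (1.0 - np.outer(a1, a2))

def cross_gram_truncated(a1, C1, a2, C2, horizon=2000):
    """Reference: sum the first `horizon` terms of the infinite series."""
    A1, A2 = np.diag(a1), np.diag(a2)
    G = np.zeros((len(a1), len(a2)))
    left, right = C1.copy(), C2.copy()   # C1 A1^k and C2 A2^k
    for _ in range(horizon):
        G += left.T @ right
        left, right = left @ A1, right @ A2
    return G

# Hypothetical data: state dimension n, output dimension p.
rng = np.random.default_rng(0)
n, p = 6, 3
a1 = rng.uniform(-0.9, 0.9, size=n)
a2 = rng.uniform(-0.9, 0.9, size=n)
C1 = rng.standard_normal((p, n))
C2 = rng.standard_normal((p, n))

G_fast = cross_gram_diag(a1, C1, a2, C2)
G_ref = cross_gram_truncated(a1, C1, a2, C2)

# Residual of the Stein equation; should be numerically zero.
residual = G_fast - np.diag(a1) @ G_fast @ np.diag(a2) - C1.T @ C2
```

With a dense state matrix the same equation would need a general-purpose solver with cubic cost in $n$, which is the gap the abstract refers to. How Inf-SSM turns such a solution into a Grassmannian regularizer is not specified here and is left to the paper.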

