LG | AI | ML | Apr 5, 2025

Memory-Statistics Tradeoff in Continual Learning with Structural Regularization

arXiv:2504.04039v1 | 3 citations | h-index: 9
Originality: Incremental advance
AI Analysis

This work addresses catastrophic forgetting in machine learning systems that learn tasks sequentially. It provides theoretical insights, but the advance is incremental because the analysis is restricted to a specific two-task linear setting.

The paper tackles catastrophic forgetting in continual learning by analyzing a structural regularization algorithm for two linear regression tasks. It establishes a trade-off in which using more regularization vectors lowers the excess risk but raises the memory complexity, and it shows that the algorithm can achieve performance comparable to joint training.

We study the statistical performance of a continual learning problem with two linear regression tasks in a well-specified random design setting. We consider a structural regularization algorithm that incorporates a generalized $\ell_2$-regularization tailored to the Hessian of the previous task for mitigating catastrophic forgetting. We establish upper and lower bounds on the joint excess risk for this algorithm. Our analysis reveals a fundamental trade-off between memory complexity and statistical efficiency, where memory complexity is measured by the number of vectors needed to define the structural regularization. Specifically, increasing the number of vectors in structural regularization leads to a worse memory complexity but an improved excess risk, and vice versa. Furthermore, our theory suggests that naive continual learning without regularization suffers from catastrophic forgetting, while structural regularization mitigates this issue. Notably, structural regularization achieves comparable performance to joint training with access to both tasks simultaneously. These results highlight the critical role of curvature-aware regularization for continual learning.
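The trade-off described in the abstract can be illustrated with a minimal NumPy sketch. It is not the paper's exact algorithm. It assumes that the regularizer is built from the top-k eigenvectors of the empirical Hessian of task 1, that the task-1 solution is ordinary least squares, and that both tasks have the same sample size. All names (`top_k_hessian`, `structural_reg_fit`, `sigma1`, `sigma2`) and the covariance values are hypothetical choices made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 200                      # dimension and per-task sample size (illustrative)
w_star = rng.normal(size=d)        # shared ground-truth parameter (well-specified model)

# Hypothetical task covariances: each task is informative in different directions.
sigma1 = np.diag([1.0, 1.0, 0.5, 0.05, 0.05])
sigma2 = np.diag([0.05, 0.05, 0.5, 1.0, 1.0])

def sample_task(sigma, n, noise=0.1):
    """Draw a random-design linear regression task with diagonal covariance sigma."""
    X = rng.normal(size=(n, d)) @ np.sqrt(sigma)
    y = X @ w_star + noise * rng.normal(size=n)
    return X, y

def fit_ols(X, y):
    """Ordinary least squares solution."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

def top_k_hessian(X, k):
    """Return k vectors sqrt(lambda_i) * v_i from the top-k eigenpairs of X^T X / n.

    These k vectors are the memory cost: V @ V.T is the rank-k Hessian approximation.
    """
    H = X.T @ X / X.shape[0]
    eigvals, eigvecs = np.linalg.eigh(H)          # eigenvalues in ascending order
    idx = np.argsort(eigvals)[::-1][:k]           # indices of the k largest
    return eigvecs[:, idx] * np.sqrt(np.clip(eigvals[idx], 0.0, None))

def structural_reg_fit(X2, y2, w1, V):
    """Minimize (1/n2) * ||X2 w - y2||^2 + ||V^T (w - w1)||^2 in closed form."""
    n2 = X2.shape[0]
    A = V @ V.T
    lhs = X2.T @ X2 / n2 + A
    rhs = X2.T @ y2 / n2 + A @ w1
    return np.linalg.solve(lhs, rhs)

def joint_excess_risk(w):
    """Sum of the two population excess risks for a candidate w."""
    diff = w - w_star
    return float(diff @ (sigma1 + sigma2) @ diff)

X1, y1 = sample_task(sigma1, n)
X2, y2 = sample_task(sigma2, n)
w1 = fit_ols(X1, y1)               # task-1 solution kept in memory

for k in range(d + 1):             # k = 0 is the unregularized (naive) baseline
    w_k = structural_reg_fit(X2, y2, w1, top_k_hessian(X1, k))
    print(f"k={k}: joint excess risk = {joint_excess_risk(w_k):.5f}")

w_joint = fit_ols(np.vstack([X1, X2]), np.concatenate([y1, y2]))
print(f"joint training: {joint_excess_risk(w_joint):.5f}")
```

In this sketch, `k = 0` reduces to least squares on task 2 alone, so nothing learned from task 1 is retained. With `k = d` and equal sample sizes, the regularized solution coincides with joint least squares on both tasks, because the task-1 squared loss is exactly quadratic around its OLS minimizer. Intermediate values of `k` store fewer vectors at the cost of a coarser approximation of the task-1 curvature.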

