High-dimensional Asymptotics of Generalization Performance in Continual Ridge Regression
This work offers theoretical insight into generalization performance in continual learning, an area where theory has lagged behind practice. The contribution is incremental in method, since it builds on existing ridge regression and random matrix theory frameworks.
The paper addresses the limited theoretical understanding of generalization in continual learning by analyzing continual ridge regression in high-dimensional linear models. It derives exact asymptotic expressions for the prediction risk and uses them to characterize three evaluation metrics: average risk, backward transfer, and forward transfer.
Continual learning is motivated by the need to adapt to real-world changes in tasks and data distributions while mitigating catastrophic forgetting. Despite significant advances in continual learning techniques, the theoretical understanding of their generalization performance lags behind. This paper examines the theoretical properties of continual ridge regression in high-dimensional linear models, where the dimension is proportional to the sample size in each task. Using random matrix theory, we derive exact expressions for the asymptotic prediction risk, which allow us to characterize three evaluation metrics of generalization performance in continual learning: average risk, backward transfer, and forward transfer. Furthermore, we present theoretical risk curves that illustrate how these evaluation metrics evolve throughout the continual learning process. Our analysis reveals several intriguing phenomena in the risk curves and shows how model specifications influence generalization performance. Simulation studies validate our theoretical findings.
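To make the setting concrete, the following is a minimal simulation sketch, not the paper's own code. It assumes a common form of continual ridge regression in which each task's estimate is penalized toward the previous one, i.e. it minimizes ||y_t - X_t w||^2 / n + lam * ||w - w_{t-1}||^2 with w_0 = 0. It also assumes isotropic Gaussian features, so the excess prediction risk reduces to a squared distance, and it uses risk-based versions of the three metrics with the zero estimator as the forward-transfer baseline. The function names (`continual_ridge`, `excess_risk`, `metrics`, `simulate`) and all parameter values are hypothetical.

```python
import numpy as np

def continual_ridge(tasks, lam):
    """Return the estimates w_1, ..., w_T, starting from w_0 = 0 (assumed update rule)."""
    p = tasks[0][0].shape[1]
    w = np.zeros(p)
    path = []
    for X, y in tasks:
        n = X.shape[0]
        # Closed-form minimiser of ||y - X w||^2 / n + lam * ||w - w_prev||^2.
        A = X.T @ X / n + lam * np.eye(p)
        b = X.T @ y / n + lam * w
        w = np.linalg.solve(A, b)
        path.append(w)
    return path

def excess_risk(w, beta):
    # For test features x ~ N(0, I): E[(x @ w - x @ beta)^2] = ||w - beta||^2.
    return float(np.sum((w - beta) ** 2))

def metrics(path, betas):
    """Average risk, backward transfer, forward transfer (risk-based, assumed definitions)."""
    T = len(betas)
    # R[i, j]: risk on task j of the estimate obtained after training on task i.
    R = np.array([[excess_risk(w, b) for b in betas] for w in path])
    # Baseline: risk of the untrained (zero) estimator on each task.
    baseline = np.array([excess_risk(np.zeros_like(b), b) for b in betas])
    avg = R[-1].mean()
    # Positive values mean risk on old tasks grew after later training (forgetting).
    bwt = np.mean([R[-1, j] - R[j, j] for j in range(T - 1)])
    # Negative values mean earlier tasks lowered the risk on a not-yet-seen task.
    fwt = np.mean([R[j - 1, j] - baseline[j] for j in range(1, T)])
    return avg, bwt, fwt

def simulate(T=3, n=200, p=100, lam=0.5, sigma=0.5, drift=0.2, seed=0):
    """Draw T related linear tasks with p/n fixed and evaluate the three metrics."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=p) / np.sqrt(p)
    # Task coefficients share a common component plus a task-specific drift.
    betas = [base + drift * rng.normal(size=p) / np.sqrt(p) for _ in range(T)]
    tasks = []
    for beta in betas:
        X = rng.normal(size=(n, p))
        y = X @ beta + sigma * rng.normal(size=n)
        tasks.append((X, y))
    return metrics(continual_ridge(tasks, lam), betas)

if __name__ == "__main__":
    avg, bwt, fwt = simulate()
    print(f"average risk={avg:.4f}  backward transfer={bwt:.4f}  forward transfer={fwt:.4f}")
```

Sweeping `lam`, the ratio `p / n`, or `drift` in `simulate` and plotting the three returned values gives empirical risk curves of the kind the abstract describes. Under the assumptions above these are finite-sample quantities, which the paper's asymptotic expressions would be expected to approximate when `n` and `p` are large.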