Global Convergence of Continual Learning on Non-IID Data
This work addresses a fundamental limitation in continual learning theory, relevant to both researchers and practitioners, by extending the analysis beyond restrictive i.i.d. assumptions.
The paper tackles the lack of theoretical guarantees for continual learning on non-i.i.d. data by establishing almost sure convergence results and providing convergence rates for forgetting and regret metrics without requiring excitation conditions.
Continual learning, which aims to learn multiple tasks sequentially, has gained extensive attention. However, most existing work focuses on empirical studies, and the theoretical aspect remains under-explored. Recently, a few investigations have considered the theory of continual learning, but only for linear regression, and they establish their results under the strict independent and identically distributed (i.i.d.) assumption and a persistent excitation condition on the feature data, both of which may be difficult to verify or guarantee in practice. To overcome this fundamental limitation, in this paper we provide a general and comprehensive theoretical analysis for continual learning of regression models. By utilizing a stochastic Lyapunov function and martingale estimation techniques, we establish, for the first time, almost sure convergence results for continual learning under a general data condition. Additionally, without any excitation condition imposed on the data, convergence rates for the forgetting and regret metrics are provided.