Spectral Collapse Drives Loss of Plasticity in Deep Continual Learning
This work addresses loss of plasticity in continual learning, a problem for AI systems that need to adapt to new tasks over time, and it builds incrementally on existing plasticity-preserving methods.
The paper tackles loss of plasticity in deep continual learning, where neural networks fail to learn new tasks without reinitialization, and shows that this failure is preceded by Hessian spectral collapse. It introduces the notion of τ-trainability as a framework for trainability and proposes two regularizers, high effective feature rank and L2 penalties, that together preserve plasticity in experiments.
We investigate why deep neural networks suffer from loss of plasticity in deep continual learning, failing to learn new tasks unless their parameters are reinitialized. We show that this failure is preceded by Hessian spectral collapse at new-task initialization, where meaningful curvature directions vanish and gradient descent becomes ineffective. To characterize the necessary condition for successful training, we introduce the notion of $\tau$-trainability and show that current plasticity-preserving algorithms can be unified under this framework. Targeting spectral collapse directly, we then discuss the Kronecker-factored approximation of the Hessian, which motivates two regularization enhancements: maintaining high effective feature rank and applying L2 penalties. Experiments on continual supervised and reinforcement learning tasks confirm that combining these two regularizers effectively preserves plasticity.
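To make the two quantities named in the abstract concrete, the following is a minimal sketch rather than the paper's implementation. It assumes one common definition of effective rank (the exponential of the entropy of the normalized singular values); the abstract does not state which definition the authors use. The names `effective_rank` and `l2_penalty` are hypothetical helpers introduced only for illustration.

```python
import numpy as np


def effective_rank(features, eps=1e-12):
    """Entropy-based effective rank of a (batch, dim) feature matrix.

    Assumption: effective rank = exp(H(p)), where p are the singular
    values normalized to sum to 1. The paper may use another definition.
    """
    s = np.linalg.svd(features, compute_uv=False)
    p = s / (s.sum() + eps)
    p = p[p > eps]  # drop numerically zero directions before taking logs
    entropy = -(p * np.log(p)).sum()
    return float(np.exp(entropy))


def l2_penalty(params, weight_decay):
    """Standard L2 penalty: weight_decay times the sum of squared weights."""
    return weight_decay * sum(float(np.sum(w ** 2)) for w in params)


rng = np.random.default_rng(0)

# Features that spread over many directions (hypothetical "healthy" layer).
healthy = rng.normal(size=(256, 32))

# Features confined to one direction (hypothetical "collapsed" layer).
collapsed = np.outer(rng.normal(size=256), rng.normal(size=32))

print("healthy effective rank:  ", effective_rank(healthy))
print("collapsed effective rank:", effective_rank(collapsed))
print("L2 penalty:", l2_penalty([np.ones((2, 2))], weight_decay=0.5))
```

In this sketch the collapsed features have an effective rank close to 1, while the healthy features stay close to the full dimension of 32. A regularizer in the spirit of the abstract would discourage the first situation and add the L2 term to the task loss; how the authors weight or implement these terms is not specified here.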