LG · AI · Yesterday

Preserving Plasticity in Continual Learning via Dynamical Isometry

arXiv:2606.09762v114.2
Originality: Incremental advance
AI Analysis

For researchers in continual learning, this work provides a theoretical understanding and practical methods to mitigate plasticity loss in deep neural networks.

The paper identifies dynamical isometry as a key mechanism for preserving plasticity in continual learning, and proposes an isometry-promoting regularization scheme and the AdamO optimizer. Their methods match or outperform existing approaches across supervised and reinforcement-learning benchmarks.

Continual training of deep neural networks under non-stationarity often leads to a progressive loss of plasticity, eventually limiting further learning. We relate plasticity to the empirical Neural Tangent Kernel, and identify dynamical isometry (the condition that layer-wise Jacobian singular values remain close to one) as a key mechanism for preserving plasticity in continual learning. We revisit a class of networks that are almost-everywhere isometric while remaining universal Lipschitz function approximators, demonstrating that near-dynamical isometry is compatible with expressive nonlinear representations. For general architectures, we propose an efficient isometry-promoting regularization scheme and identify a novel mechanism by which it can reactivate dormant ReLU units. Building on this, we introduce AdamO, an Adam-style adaptive optimizer that decouples isometry regularization from gradient updates, analogous to AdamW. We further reinterpret prior plasticity-preserving approaches through the lens of dynamical isometry, showing that they target only a partial measure of isometry. Across supervised and reinforcement-learning continual-learning benchmarks designed to induce plasticity loss, our methods consistently match or outperform existing approaches.
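As a rough illustration of the ideas in the abstract, the sketch below uses NumPy to (1) compute the singular values of a single ReLU layer's Jacobian, (2) evaluate a soft orthogonality penalty on the weights, and (3) apply that penalty in a decoupled step after an Adam update, in the spirit of AdamW's decoupled weight decay. The penalty `0.25 * ||W^T W - I||_F^2` is a common stand-in and is an assumption here; the abstract does not give the paper's regularizer or the AdamO update rule. The function names `layer_jacobian_singular_values`, `orthogonality_penalty`, `orthogonality_penalty_grad`, and `decoupled_isometry_step`, along with all hyperparameter defaults, are hypothetical.

```python
import numpy as np

def layer_jacobian_singular_values(W, x):
    """Singular values of the Jacobian of h = relu(W @ x) with respect to x."""
    pre = W @ x
    mask = (pre > 0).astype(W.dtype)   # ReLU derivative: 0 for inactive units
    J = mask[:, None] * W              # J = diag(mask) @ W
    return np.linalg.svd(J, compute_uv=False)

def orthogonality_penalty(W):
    """Assumed stand-in penalty: 0.25 * ||W^T W - I||_F^2."""
    G = W.T @ W - np.eye(W.shape[1])
    return 0.25 * np.sum(G ** 2)

def orthogonality_penalty_grad(W):
    """Gradient of the penalty above with respect to W: W (W^T W - I)."""
    return W @ (W.T @ W - np.eye(W.shape[1]))

def decoupled_isometry_step(W, grad, m, v, t, lr=1e-3, lam=1e-2,
                            beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step on the loss gradient, then a separate penalty step.

    Hypothetical sketch: the penalty gradient bypasses Adam's moment
    estimates, mirroring how AdamW treats weight decay. This is not
    claimed to be the paper's AdamO update.
    """
    # Standard Adam moments, computed from the loss gradient only.
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    W = W - lr * m_hat / (np.sqrt(v_hat) + eps)
    # Decoupled isometry step, applied directly to the weights.
    W = W - lr * lam * orthogonality_penalty_grad(W)
    return W, m, v

# A dormant ReLU unit zeroes one singular value of the layer Jacobian.
sv = layer_jacobian_singular_values(np.eye(3), np.array([1.0, -1.0, 2.0]))

# With a zero loss gradient, the decoupled step alone reduces the penalty.
rng = np.random.default_rng(0)
W = 0.5 * rng.standard_normal((4, 4))
penalty_before = orthogonality_penalty(W)
m, v = np.zeros_like(W), np.zeros_like(W)
for t in range(1, 51):
    W, m, v = decoupled_isometry_step(W, np.zeros_like(W), m, v, t,
                                      lr=0.1, lam=1.0)
penalty_after = orthogonality_penalty(W)
```

In the first example the middle unit has a negative pre-activation, so the Jacobian loses a singular direction entirely, which is one way to picture how dormant units work against isometry. Keeping the penalty out of Adam's moment estimates means its strength is not rescaled by the per-parameter gradient statistics.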
