NE, AI | Jul 31, 2025

Reinitializing weights vs units for maintaining plasticity in neural networks

arXiv:2508.00212v2 | 4 citations | h-index: 7
Originality: Incremental advance
AI Analysis

The paper addresses a crucial issue for designing systems that learn continually, but the advance is incremental because it builds on existing reinitialization techniques.

The paper tackles the problem of loss of plasticity in neural networks during continual learning by comparing reinitializing units versus weights, finding that reinitializing weights is more effective in small networks or with layer normalization, while both methods are equally effective in large networks without normalization.

Loss of plasticity is a phenomenon in which a neural network loses its ability to learn when trained for an extended time on non-stationary data. It is a crucial problem to overcome when designing systems that learn continually. An effective technique for preventing loss of plasticity is reinitializing parts of the network. In this paper, we compare two different reinitialization schemes: reinitializing units vs reinitializing weights. We propose a new algorithm, which we name selective weight reinitialization, for reinitializing the least useful weights in a network. We compare our algorithm to continual backpropagation and ReDo, two previously proposed algorithms that reinitialize units in the network. Through our experiments in continual supervised learning problems, we identify two settings when reinitializing weights is more effective at maintaining plasticity than reinitializing units: (1) when the network has a small number of units and (2) when the network includes layer normalization. Conversely, reinitializing weights and units are equally effective at maintaining plasticity when the network is of sufficient size and does not include layer normalization. We found that reinitializing weights maintains plasticity in a wider variety of settings than reinitializing units.
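
To make the distinction between the two schemes concrete, here is a minimal NumPy sketch. It is not the paper's algorithm: the abstract does not say how usefulness is measured, so weight magnitude (for weights) and summed outgoing weight magnitude (for units) are used as stand-in utilities. The function names `reinit_weights` and `reinit_units`, the `fraction` and `scale` parameters, and the Gaussian resampling are all assumptions made for illustration.

```python
import numpy as np


def reinit_weights(W, fraction, rng, scale=0.1):
    """Resample the `fraction` of individual weights with the lowest utility.

    Utility is approximated by |w| (an assumption, not the paper's measure).
    Returns the new matrix and a boolean mask of the reset entries.
    """
    W = W.copy()
    k = int(fraction * W.size)
    mask = np.zeros(W.size, dtype=bool)
    # Flat indices of the k smallest-magnitude weights.
    mask[np.argsort(np.abs(W), axis=None)[:k]] = True
    mask = mask.reshape(W.shape)
    W[mask] = rng.normal(0.0, scale, size=k)
    return W, mask


def reinit_units(W_in, W_out, fraction, rng, scale=0.1):
    """Reset whole hidden units instead of individual weights.

    W_in has shape (n_in, n_hidden); W_out has shape (n_hidden, n_out).
    Unit utility is approximated by the summed |outgoing weight| (assumption).
    Incoming weights are resampled and outgoing weights are zeroed, so a
    reset unit does not immediately change the network's output.
    """
    W_in, W_out = W_in.copy(), W_out.copy()
    k = int(fraction * W_in.shape[1])
    units = np.argsort(np.abs(W_out).sum(axis=1))[:k]
    W_in[:, units] = rng.normal(0.0, scale, size=(W_in.shape[0], k))
    W_out[units, :] = 0.0
    return W_in, W_out, units


rng = np.random.default_rng(0)
W_in = rng.normal(size=(5, 4))   # 5 inputs, 4 hidden units
W_out = rng.normal(size=(4, 2))  # 4 hidden units, 2 outputs

# Weight-level reset: touches scattered entries anywhere in the matrix.
W_in_w, reset_mask = reinit_weights(W_in, fraction=0.25, rng=rng)

# Unit-level reset: touches entire columns of W_in and rows of W_out.
W_in_u, W_out_u, reset_units = reinit_units(W_in, W_out, fraction=0.5, rng=rng)
```

The weight-level reset can refresh a few connections in every unit, whereas the unit-level reset discards everything a chosen unit has learned. In a network with few units, that second cost is proportionally larger, which is consistent with the small-network setting the abstract highlights.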

