Shared-Weights Extender and Gradient Voting for Neural Network Expansion
This work addresses a specific bottleneck in neural network expansion, the inactivity of newly added neurons, and offers machine learning practitioners an incremental improvement over existing methods.
The paper tackles the problem of newly added neurons becoming inactive during neural network expansion, which limits capacity growth. The proposed Shared-Weights Extender (SWE) and Steepest Voting Distributor (SVoD) suppress neuron inactivity and achieve better performance on four datasets than other expanding methods and baselines.
Expanding neural networks during training is a promising way to augment capacity without retraining larger models from scratch. However, newly added neurons often fail to adjust to a trained network and become inactive, providing no contribution to capacity growth. We propose the Shared-Weights Extender (SWE), a novel method explicitly designed to prevent inactivity of new neurons by coupling them with existing ones for smooth integration. In parallel, we introduce the Steepest Voting Distributor (SVoD), a gradient-based method for allocating neurons across layers during deep network expansion. Our extensive benchmarking on four datasets shows that our method can effectively suppress neuron inactivity and achieve better performance compared to other expanding methods and baselines.
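To make the notion of neuron inactivity concrete, here is a minimal sketch that flags neurons in a ReLU layer whose output is zero on every probe input. It is not the SWE or SVoD procedure, which the abstract does not specify. The ReLU assumption, the helper names (`layer_outputs`, `inactive_neurons`), and the toy weights are all hypothetical and chosen only for illustration.

```python
def relu(x):
    return x if x > 0.0 else 0.0


def layer_outputs(weights, biases, x):
    """ReLU outputs of one dense layer for a single input vector x."""
    return [
        relu(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
        for w, b in zip(weights, biases)
    ]


def inactive_neurons(weights, biases, inputs):
    """Indices of neurons whose ReLU output is zero on every input."""
    active = [False] * len(weights)
    for x in inputs:
        for j, out in enumerate(layer_outputs(weights, biases, x)):
            if out > 0.0:
                active[j] = True
    return [j for j, is_active in enumerate(active) if not is_active]


# Hypothetical toy layer: two existing neurons plus two newly added ones.
weights = [[1.0, 0.0], [0.0, 1.0],   # existing neurons
           [0.5, 0.5], [1.0, 1.0]]   # newly added neurons
biases = [0.0, 0.0, 0.1, -10.0]      # large negative bias keeps neuron 3 off
inputs = [[1.0, 2.0], [0.5, -1.0], [2.0, 0.0]]

dead = inactive_neurons(weights, biases, inputs)
inactive_ratio = len(dead) / len(weights)
print(dead, inactive_ratio)
```

In this toy setup only the last added neuron never fires, so it receives no gradient through the ReLU and adds no capacity. This is the kind of failure the abstract says SWE is designed to prevent.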