LG | CV | NE | ML | Jul 2, 2020

Persistent Neurons

arXiv:2007.01419v2
AI Analysis

This paper addresses the challenge of finding better optima in neural network training, particularly under poor initialization. The contribution appears incremental, since it builds on existing optimization methods.

The paper tackles the sensitivity of neural networks to initialization and data distribution. It proposes persistent neurons, a trajectory-based optimization strategy that uses information from previously converged solutions to avoid bad local minima, and it reports improved performance across neural network architectures such as AlexNet and ResNet.

Neural network (NN)-based learning algorithms are strongly affected by the choice of initialization and data distribution. Different optimization strategies have been proposed for improving the learning trajectory and finding better optima. However, designing improved optimization strategies is difficult under the conventional landscape view. Here, we propose persistent neurons, a trajectory-based strategy that optimizes the learning task using information from previously converged solutions. More precisely, we use the ends of trajectories and let the parameters explore new landscapes by penalizing the model for converging to previous solutions under the same initialization. Persistent neurons can be regarded as a stochastic gradient method with informed bias, where individual updates are corrupted by deterministic error terms. Specifically, we show that, under certain data distributions, persistent neurons are able to converge to better solutions where initializations under popular frameworks find bad local minima. We further demonstrate that persistent neurons help improve the model's performance under both good and poor initializations. We evaluate the full and partial persistent models and show that they can boost performance on a range of NN structures, such as AlexNet and the residual neural network (ResNet).
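
The idea of penalizing convergence to a previous solution from the same initialization can be illustrated on a toy problem. The following is a minimal sketch, not the paper's formulation: the one-dimensional loss, the Gaussian-shaped repulsion term, the plain (non-stochastic) gradient descent loop, and all names and hyperparameters (`penalty_grad`, `strength`, `width`, `descend`) are assumptions chosen for illustration only.

```python
import math

# Toy loss with a poor local minimum near w = +1 and a better one near w = -1.
# (Hypothetical example; not taken from the paper.)
def loss(w):
    return (w * w - 1.0) ** 2 + 0.3 * w

def loss_grad(w):
    return 4.0 * w * (w * w - 1.0) + 0.3

def penalty_grad(w, w_prev, strength=2.0, width=0.5):
    # Gradient of the assumed penalty
    #   strength * exp(-(w - w_prev)^2 / (2 * width^2)),
    # which is largest at a previously converged solution w_prev.
    diff = w - w_prev
    return -strength * diff / width ** 2 * math.exp(-diff * diff / (2.0 * width ** 2))

def descend(w0, previous=(), lr=0.05, steps=2000):
    # Gradient descent whose update carries a deterministic bias term
    # pushing the parameter away from every solution in `previous`.
    w = w0
    for _ in range(steps):
        g = loss_grad(w) + sum(penalty_grad(w, p) for p in previous)
        w -= lr * g
    return w

w_init = 0.5                                     # same initialization for both runs
w_first = descend(w_init)                        # unpenalized run
w_second = descend(w_init, previous=[w_first])   # run repelled from the first solution

print("first run: ", w_first, loss(w_first))
print("second run:", w_second, loss(w_second))
```

In this sketch the first run settles in the minimum nearest the initialization, while the second run, started from the same point, is pushed across the barrier and ends in the lower minimum. Because the assumed penalty decays quickly with distance, it barely perturbs the second solution once the parameter has moved away.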

