LG DIS-NN OC ML

Mar 12, 2023

Phase Diagram of Initial Condensation for Two-layer Neural Networks

Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

arXiv:2303.06561v2

17.014 citations

h-index: 15

Originality: Synthesis-oriented

AI Analysis

The paper gives deep learning researchers insight into the dynamical regimes of neural networks, but it is incremental because it builds on earlier work.

The paper tackles the problem of understanding distinct behaviors of neural networks under different initialization scales by presenting a phase diagram of initial condensation for two-layer networks, demonstrating how small initialization leads to condensation at the initial training stage.

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research. In this paper, based on the earlier work by Luo et al. (2021), we present a phase diagram of initial condensation for two-layer neural networks. Condensation is a phenomenon wherein the weight vectors of neural networks concentrate on isolated orientations during the training process, and it is a feature of the non-linear learning process that enables neural networks to possess better generalization abilities. Our phase diagram serves to provide a comprehensive understanding of the dynamical regimes of neural networks and their dependence on the choice of hyperparameters related to initialization. Furthermore, we demonstrate in detail the underlying mechanisms by which small initialization leads to condensation at the initial training stage.
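The abstract describes condensation as weight vectors concentrating on isolated orientations. One way to make that concrete is to compare the directions of the hidden neurons' input weight vectors. The sketch below is illustrative only and is not taken from the paper: `cosine_similarity` and `condensation_score` are hypothetical helper names, and the score (mean absolute cosine similarity over neuron pairs) is a simplified diagnostic that only detects condensation onto a single line, not onto several isolated directions.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two weight vectors u and v."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def condensation_score(weights):
    """Mean absolute cosine similarity over all distinct neuron pairs.

    Hypothetical diagnostic (an assumption, not the paper's definition):
    values near 1 mean every weight vector lies along one line, up to sign;
    values near 0 mean the vectors are close to mutually orthogonal.
    """
    n = len(weights)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    total = sum(abs(cosine_similarity(weights[i], weights[j])) for i, j in pairs)
    return total / len(pairs)


# Three neurons whose input weights are all multiples of (1, 2): condensed.
condensed = [[1.0, 2.0], [2.0, 4.0], [-0.5, -1.0]]

# Three neurons with mutually orthogonal input weights: not condensed.
spread = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

print(condensation_score(condensed))  # close to 1
print(condensation_score(spread))     # close to 0
```

In practice one would apply such a score to the input weights of a trained two-layer network at several points during training. Tracking it over time is one assumed way to see whether small initialization drives the neurons toward a few shared directions early on.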

