Exchangeability and Kernel Invariance in Trained MLPs
This work offers a theoretical insight for researchers analyzing trained neural networks, though the contribution appears incremental: it relaxes an existing assumption without delivering a major practical breakthrough.
The paper replaces the assumption of IID parameters in MLPs with the weaker condition of exchangeability and shows the sense in which the weights satisfy it. From this it derives that, in certain cases, the layer-wise kernels of fully-connected layers remain approximately constant during training. It also identifies a sharp change in macroscopic network behavior as the covariance between weights moves away from zero.
In the analysis of machine learning models, it is often convenient to assume that the parameters are IID. This assumption is not satisfied when the parameters are updated through training processes such as SGD. A relaxation of the IID condition is a probabilistic symmetry known as exchangeability. We show the sense in which the weights in MLPs are exchangeable. This yields the result that in certain instances, the layer-wise kernel of fully-connected layers remains approximately constant during training. We identify a sharp change in the macroscopic behavior of networks as the covariance between weights changes from zero.
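To make the distinction between IID and exchangeable weights concrete, the following is a minimal sketch and not the paper's construction. Exchangeability only requires that the joint distribution of the weights be unchanged under permutation, so the weights may share a common pairwise covariance without being independent. The sketch assumes a simple equicorrelated Gaussian model with nonnegative covariance. The function names and the shared-factor construction are illustrative assumptions. For such weights, the variance of the normalized sum (1/sqrt(n)) * sum_i w_i is var + (n - 1) * cov. It stays bounded in n only when cov is exactly zero, which gives one elementary picture of why behavior can change sharply as the covariance leaves zero.

```python
import math
import random


def sample_exchangeable_weights(n, var=1.0, cov=0.0, rng=random):
    """Draw n jointly Gaussian weights with common variance `var` and
    common pairwise covariance `cov` (assumed here: 0 <= cov <= var).

    Illustrative construction: one shared Gaussian factor plus independent
    noise. Every weight is built the same way, so the joint distribution is
    permutation-invariant, i.e. the weights are exchangeable.
    """
    if not 0.0 <= cov <= var:
        raise ValueError("this sketch assumes 0 <= cov <= var")
    shared = rng.gauss(0.0, 1.0)  # common factor that induces the covariance
    return [
        math.sqrt(cov) * shared + math.sqrt(var - cov) * rng.gauss(0.0, 1.0)
        for _ in range(n)
    ]


def normalized_sum_variance(n, var=1.0, cov=0.0):
    """Exact variance of (1/sqrt(n)) * sum_i w_i for exchangeable weights."""
    return var + (n - 1) * cov


def empirical_normalized_sum_variance(n, var, cov, trials, seed=0):
    """Monte Carlo estimate of the same variance, for comparison."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(trials):
        s = sum(sample_exchangeable_weights(n, var, cov, rng)) / math.sqrt(n)
        total += s
        total_sq += s * s
    mean = total / trials
    return total_sq / trials - mean * mean


if __name__ == "__main__":
    for cov in (0.0, 0.01, 0.1):
        for n in (10, 100, 1000):
            exact = normalized_sum_variance(n, 1.0, cov)
            approx = empirical_normalized_sum_variance(n, 1.0, cov, trials=2000)
            print(f"cov={cov:<5} n={n:<5} exact={exact:8.3f} empirical={approx:8.3f}")
```

With `cov = 0` the weights are IID and the normalized sum has variance `var` for every width `n`. With any fixed positive `cov` the variance grows linearly in `n`, however small the covariance is.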