LG, CO, GT · Jun 9, 2023

Hidden symmetries of ReLU networks

arXiv:2306.06179v1 · 37 citations · h-index: 23
Originality: Incremental advance
AI Analysis

This work studies the parameter-to-function mapping of neural networks, a topic aimed at researchers in machine learning theory. The advance is incremental because it builds on known symmetries.

The paper investigates the redundancy in parameter representations of ReLU neural networks. It proves that, for architectures where no layer is narrower than the input, parameter settings without hidden symmetries exist. Empirically, it shows that the probability of having no hidden symmetries decreases with depth and increases with width and input dimension.

The parameter space for any fixed architecture of feedforward ReLU neural networks serves as a proxy during training for the associated class of functions - but how faithful is this representation? It is known that many different parameter settings can determine the same function. Moreover, the degree of this redundancy is inhomogeneous: for some networks, the only symmetries are permutation of neurons in a layer and positive scaling of parameters at a neuron, while other networks admit additional hidden symmetries. In this work, we prove that, for any network architecture where no layer is narrower than the input, there exist parameter settings with no hidden symmetries. We also describe a number of mechanisms through which hidden symmetries can arise, and empirically approximate the functional dimension of different network architectures at initialization. These experiments indicate that the probability that a network has no hidden symmetries decreases towards 0 as depth increases, while increasing towards 1 as width and input dimension increase.
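The two known symmetries named in the abstract, and the idea of approximating functional dimension, can be illustrated with a minimal NumPy sketch. It is not the authors' code. The architecture `[2, 3, 3, 1]`, the helper names (`relu_net`, `permute_and_scale`, `grad_wrt_params`), the random batch of 200 points and the rank tolerance are all assumptions made for illustration. The sketch also assumes that functional dimension is probed as the rank of the Jacobian of the outputs on a finite batch with respect to the parameters, which only gives a lower bound for that batch. Because each hidden neuron contributes one positive-scaling direction along which the function does not change, this rank cannot exceed the number of parameters minus the number of hidden neurons.

```python
import numpy as np

def relu_net(params, x):
    """Evaluate a feedforward ReLU network; params is a list of (W, b) pairs."""
    h = x
    for W, b in params[:-1]:
        h = np.maximum(W @ h + b, 0.0)
    W, b = params[-1]
    return W @ h + b  # the final layer is affine (no ReLU)

def permute_and_scale(params, layer, perm, scale):
    """Permute and positively rescale the neurons of hidden layer `layer`.

    Incoming weights and biases are multiplied by `scale`, outgoing weights
    are divided by it, and both are reordered by `perm`.
    """
    new = [(W.copy(), b.copy()) for W, b in params]
    W, b = new[layer]
    W_next, b_next = new[layer + 1]
    new[layer] = (scale[:, None] * W)[perm], (scale * b)[perm]
    new[layer + 1] = (W_next / scale[None, :])[:, perm], b_next
    return new

def grad_wrt_params(params, x):
    """Gradient of the scalar output with respect to all parameters (manual backprop)."""
    hs, masks = [x], []
    for W, b in params[:-1]:
        z = W @ hs[-1] + b
        masks.append(z > 0)
        hs.append(np.maximum(z, 0.0))
    g = np.ones(1)  # d(output)/d(output)
    grads = []
    for l in range(len(params) - 1, -1, -1):
        if l < len(params) - 1:
            g = g * masks[l]  # back through the ReLU of hidden layer l
        grads.append(np.concatenate([np.outer(g, hs[l]).ravel(), g]))
        g = params[l][0].T @ g  # gradient with respect to the input of layer l
    return np.concatenate(grads[::-1])

rng = np.random.default_rng(0)
widths = [2, 3, 3, 1]  # hypothetical architecture: no layer narrower than the input
params = [(rng.standard_normal((m, n)), rng.standard_normal(m))
          for n, m in zip(widths[:-1], widths[1:])]

# A different parameter setting obtained from the two known symmetries.
twin = permute_and_scale(params, 0, np.array([2, 0, 1]), np.array([0.5, 2.0, 3.0]))

points = rng.standard_normal((200, widths[0]))
max_gap = max(np.abs(relu_net(params, x) - relu_net(twin, x)).max() for x in points)

# Rank of the parameter Jacobian on the batch: a lower bound on functional dimension.
J = np.stack([grad_wrt_params(params, x) for x in points])
rank = np.linalg.matrix_rank(J, tol=1e-8)
n_params = J.shape[1]
n_hidden = sum(widths[1:-1])

print("largest output difference between the two settings:", max_gap)
print("Jacobian rank:", rank, "upper bound:", n_params - n_hidden)
```

If the rank reported for a given initialization falls below the upper bound, this is consistent with extra redundancy at that parameter setting, but a single random batch cannot establish it.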

