Points of non-linearity of functions generated by random neural networks
This work provides theoretical insight into neural network behavior for researchers in machine learning theory, though the contribution is incremental, since it builds on existing analyses of random networks.
The paper studies the geometry of functions generated by random neural networks with ReLU activations by computing the expected distribution of their points of non-linearity. It uses this distribution to explain why such networks may be biased towards outputting simpler functions and why they struggle to approximate certain low-complexity functions.
We consider functions from the real numbers to the real numbers, output by a neural network with one hidden activation layer, arbitrary width, and ReLU activation function. We assume that the parameters of the neural network are chosen at random according to various probability distributions, and compute the expected distribution of the points of non-linearity. We use these results to explain why the network may be biased towards outputting functions with simpler geometry, and why certain functions with low information-theoretic complexity are nonetheless hard for a neural network to approximate.
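To make the object of study concrete, the following is a minimal numerical sketch, not the paper's method. It assumes the network has the form f(x) = sum_i a_i ReLU(w_i x + b_i) + c and that all parameters are i.i.d. standard normal; the paper considers various distributions, and this particular choice, along with the function names sample_breakpoints and empirical_density, is purely illustrative. Under this form, hidden unit i can only introduce a point of non-linearity at x = -b_i / w_i.

```python
import numpy as np


def sample_breakpoints(width, rng):
    """Draw one random network; return its sorted candidate points of non-linearity."""
    # Assumption (not from the text): i.i.d. standard normal parameters.
    w = rng.standard_normal(width)  # input-to-hidden weights
    b = rng.standard_normal(width)  # hidden biases
    a = rng.standard_normal(width)  # hidden-to-output weights
    # Hidden unit i contributes a kink where w_i * x + b_i = 0, provided
    # both w_i and a_i are non-zero.
    active = (w != 0) & (a != 0)
    return np.sort(-b[active] / w[active])


def empirical_density(width, n_networks, bins, seed=0):
    """Histogram of breakpoints pooled over many independently sampled networks."""
    rng = np.random.default_rng(seed)
    pooled = np.concatenate(
        [sample_breakpoints(width, rng) for _ in range(n_networks)]
    )
    # density=True normalizes over the samples that fall inside `bins` only.
    density, edges = np.histogram(pooled, bins=bins, density=True)
    return density, edges
```

Calling empirical_density(20, 200, np.linspace(-5, 5, 41)) gives a Monte Carlo estimate of where breakpoints concentrate on the interval [-5, 5] under the assumed distribution. Swapping in a different sampler for w and b shows how the estimate depends on the parameter distribution.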