Randomly Initialized One-Layer Neural Networks Make Data Linearly Separable
The paper addresses computational efficiency in neural networks, a concern for both researchers and practitioners, by offering a theoretical guarantee of linear separability without training. The contribution is incremental, as it builds on existing work on random networks.
The paper shows that a randomly initialized one-layer neural network with sufficient width can, with high probability, transform two arbitrary sets into linearly separable sets without training, providing bounds on the required width that overcome the curse of dimensionality.
Recently, neural networks have demonstrated remarkable capabilities in mapping two arbitrary sets to two linearly separable sets. The prospect of achieving this with randomly initialized neural networks is particularly appealing due to their computational efficiency compared to fully trained networks. This paper contributes by establishing that, given sufficient width, a randomly initialized one-layer neural network can, with high probability, transform two sets into two linearly separable sets without any training. Moreover, we furnish precise bounds on the necessary width of the neural network for this phenomenon to occur. Our initial bound exhibits exponential dependence on the input dimension while maintaining polynomial dependence on all other parameters. In contrast, our second bound is independent of input dimension, effectively surmounting the curse of dimensionality. The main tools used in our proof rely heavily on a fusion of geometric principles and concentration of random matrices.
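As an illustrative sketch of the claim (not the paper's construction, and not a test of its width bounds), the following Python snippet maps the four XOR points, which no hyperplane separates in the input space, through a single random ReLU layer and then checks whether a linear separator exists on the resulting features. The standard Gaussian weights and biases, the ReLU activation, the width of 64, and the helper names `random_relu_features` and `lstsq_separates` are all assumptions made for this example.

```python
import numpy as np


def random_relu_features(X, width, rng):
    """Apply one untrained layer: x -> ReLU(W x + b).

    Assumption for this sketch: W and b have i.i.d. standard Gaussian entries.
    """
    n_inputs = X.shape[1]
    W = rng.standard_normal((n_inputs, width))
    b = rng.standard_normal(width)
    return np.maximum(X @ W + b, 0.0)


def lstsq_separates(features, labels, margin=1e-6):
    """Return True if a least-squares affine fit classifies every point correctly.

    True certifies linear separability. False is inconclusive in general,
    because least squares is only one way to search for a separator.
    """
    ones = np.ones((features.shape[0], 1))
    A = np.hstack([features, ones])  # append a constant column for the offset
    coef, *_ = np.linalg.lstsq(A, labels, rcond=None)
    scores = A @ coef
    return bool(np.min(labels * scores) > margin)


# XOR: the two classes cannot be split by any line in the plane.
X = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])

rng = np.random.default_rng(0)
features = random_relu_features(X, width=64, rng=rng)

print("raw inputs separated by least squares:", lstsq_separates(X, y))
print("random ReLU features separated:", lstsq_separates(features, y))
```

The least-squares check is used here only because it is short and needs no training loop. A perceptron or a linear program would serve equally well as a separability test, and for larger point sets the width needed in practice may differ from what this toy case suggests.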