Intriguing Properties of Randomly Weighted Networks: Generalizing While Learning Next to Nothing
This work addresses the efficiency and robustness of neural network training for machine learning practitioners, but it is incremental: it extends prior ideas such as Extreme Learning Machines rather than introducing a new paradigm.
The paper tackles the problem of reducing the number of trainable parameters in deep neural networks by fixing most layers to random weights. Its experiments show that this approach often achieves performance on par with fully trained networks.
Training deep neural networks results in strong learned representations that generalize well. In most cases, training iteratively modifies all weights in the network via back-propagation. Extreme Learning Machines instead set the first layer of a network to fixed random values rather than learning it. In this paper, we propose to take this approach a step further and fix almost all layers of a deep convolutional neural network, allowing only a small portion of the weights to be learned. As our experiments show, fixing even the majority of the parameters of the network often results in performance on par with learning all of them. We discuss the implications of this intriguing property of deep neural networks and suggest ways to harness it to create more robust representations.
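To make the idea of fixing most parameters concrete, the following is a minimal sketch and not the method evaluated in the paper. It assumes a small fully connected network in place of a convolutional one, a toy regression target, and a choice to train only the final linear readout; the abstract does not specify which portion of the weights is learned. All function names (`random_layer`, `forward_frozen`, `train_readout`, `run_demo`) are hypothetical and introduced only for illustration.

```python
import copy
import math
import random


def random_layer(n_in, n_out, rng):
    """Draw a dense layer once; it is treated as frozen afterwards."""
    scale = 1.0 / math.sqrt(n_in)  # keeps tanh inputs in a moderate range
    weights = [[rng.gauss(0.0, scale) for _ in range(n_in)] for _ in range(n_out)]
    biases = [rng.gauss(0.0, scale) for _ in range(n_out)]
    return weights, biases


def forward_frozen(x, layers):
    """Pass an input through the frozen random layers (tanh activations)."""
    h = x
    for weights, biases in layers:
        h = [math.tanh(sum(w * v for w, v in zip(row, h)) + b)
             for row, b in zip(weights, biases)]
    return h


def predict(features, readout):
    w, b = readout
    return sum(wi * fi for wi, fi in zip(w, features)) + b


def mean_squared_error(feature_rows, targets, readout):
    errors = [(predict(f, readout) - t) ** 2 for f, t in zip(feature_rows, targets)]
    return sum(errors) / len(errors)


def train_readout(feature_rows, targets, epochs=100, lr=0.02):
    """Fit only the linear readout with plain SGD on squared error."""
    w = [0.0] * len(feature_rows[0])
    b = 0.0
    for _ in range(epochs):
        for f, t in zip(feature_rows, targets):
            err = predict(f, (w, b)) - t
            w = [wi - lr * err * fi for wi, fi in zip(w, f)]
            b -= lr * err
    return w, b


def count_params(layers):
    return sum(len(ws) * len(ws[0]) + len(bs) for ws, bs in layers)


def run_demo(seed=0):
    rng = random.Random(seed)
    # Two frozen random layers: 2 -> 16 -> 16 (an assumed toy architecture).
    frozen = [random_layer(2, 16, rng), random_layer(16, 16, rng)]
    snapshot = copy.deepcopy(frozen)

    # Hypothetical toy regression task, not taken from the paper.
    inputs = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(200)]
    targets = [math.sin(2 * x0) + 0.5 * x1 for x0, x1 in inputs]

    # Frozen layers never change, so their outputs can be computed once.
    feature_rows = [forward_frozen(x, frozen) for x in inputs]

    untrained = ([0.0] * 16, 0.0)
    readout = train_readout(feature_rows, targets)
    return {
        "frozen": count_params(frozen),
        "trainable": len(readout[0]) + 1,
        "loss_before": mean_squared_error(feature_rows, targets, untrained),
        "loss_after": mean_squared_error(feature_rows, targets, readout),
        "frozen_unchanged": frozen == snapshot,
    }
```

Calling `run_demo()` returns the frozen and trainable parameter counts together with the training loss before and after fitting the readout. In this sketch only the readout is updated, so the random layers act as a fixed feature extractor and their outputs are computed once rather than on every training step.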